PREFACE


This report describes part of a comprehensive and continuing program of 
research concerned with advancing the state-of-the-art in remote sensing of 
the environment from aircraft and satellites. The research is being carried out 
for NASA's Lyndon B. Johnson Space Center, Houston, Texas, by the Environmental
Research Institute of Michigan (ERIM), formerly the Willow Run Laboratories of 
The University of Michigan. The basic objective of this multidisciplinary program 
is to develop remote sensing as a practical tool to provide the planner and 
decision-maker with extensive information quickly and economically. 

Timely information obtained by remote sensing can be important to such 
people as the farmer, the city planner, the conservationist, and others concerned 
with problems such as crop yield and disease, urban land studies and development, 
water pollution, and forest management. The scope of our program includes (1) 
extending the understanding of basic processes; (2) discovering new applications,
developing advanced remote-sensing systems, and improving automatic data 
processing to extract information in a useful form; and (3) assisting in data 
collection, processing, analysis, and ground-truth verification. 

The research described herein was performed under NASA Contract NAS9-14123,
Task V, and covers the period from May 15, 1974 through March 14, 1975. Andrew
Potter (TF3) was the NASA Contract Technical Monitor. The program was directed
by Richard R. Legault, Vice-President of ERIM and Head of the Infrared and Optics 
Division, Jon D. Erickson, Head of the Information and Analysis Department, and 
Richard F. Nalepka, Principal Investigator and Head of the Multispectral Analysis 
Section. 

Part I of this report was written by Wyman Richardson. The author gratefully
acknowledges the helpful suggestions of James M. Gleason, Richard J. Kauth,
Michael J. McClary, and Robert B. Crane of ERIM. Part II was written by
James M. Gleason. The author wishes to express his appreciation for the
contributions of Robert B. Crane and Wyman Richardson. The ERIM number of this
report is 109600-18-F.

1

SUMMARY 

The two subjects of this report are nine-point classification and 
boundary detection. 

Nine-point classification rules decide what material to assign to a pixel 
on the basis of data from that pixel and from its eight immediate neighbors. 
They are applicable whenever a pixel is likely to represent the same material 
as its neighbors, as is the case with agricultural data. The purpose of such 
rules is to gain recognition accuracy at only a slight extra cost in processing 
time. 

In the previous work [1], three such rules were defined, implemented, and
tested.

LIKE9, the nine-point maximum likelihood rule, amounts to adding, for each
material, the nine multivariate normal exponents and choosing the material with
the smallest sum. To prevent occasional alien points from disturbing the
decision rule, only the m smallest exponents are summed, where m is a number
between 1 and 9.

AVE9 averages the nine data points and then applies the usual one-point
rule, QRULE. It is modified to delete the t largest and t smallest of the nine
data values in each channel.

VOTE9, applied after QRULE decisions have been made on the nine pixels,
assigns to the center pixel the material most frequently recognized among the
nine pixels.

These three nine-point rules substantially outperformed QRULE on field 
interiors, but maps made by them showed distortion and loss of detail on the 
boundaries. They are based on the rigid premise that all nine pixels represent 
the same material and they performed clumsily when the premise failed. 


[1] W. Richardson, A Study of Some Nine-Element Decision Rules, Technical
    Report 190100-32-T, Environmental Research Institute of Michigan,
    Ann Arbor, Michigan, 1974.

To reach the goal of developing nine-point rules that are accurate on
boundaries, and thereby suitable for satellite data with its high frequency 
of boundary points, two nine-point rules were derived from assumptions less 
rigid than neighborhood conformity. 

BAYES9 is based on the assumption that a pixel probably represents the
same material as its neighbor, with the degree of dependence specified by a
parameter θ between 0 (independence) and 1 (complete dependence).

PRIOR9 makes a Bayesian decision on the center pixel based on prior
probabilities estimated from neighborhood data values. The estimated prior
probability of a material is the average, over nine pixels, of the posterior
probability of that material at each pixel.

PREF9 uses as its decision criterion the estimated prior probability just
defined for PRIOR9. It is conceptually an improved voting rule that takes
account of all the information at each pixel rather than just a vote for the
winning material.

In preliminary tests on aircraft data, BAYES9 with θ = .3, .5, .7, and .9
performed as well on field interiors as the three previously studied rules
and was considerably more sensitive to fine detail on the boundaries. For
θ = .1, the field interior results were nearly as good. The maps improved in
fidelity to known boundaries as θ decreased from .9 to .1.

PRIOR9 made an excellent map of boundary areas, but its improvement over
QRULE in field interiors was only half that of the best nine-point rules.

PREF9 ranked with the best on field interiors, a little better than the
other voting rule VOTE9, but resembled LIKE9 and AVE9 in its clumsy
representation of boundary areas.

A null test (i.e., a means of deciding "none of these") was included in
each of the new rules by defining a null category of all materials not associated
with a specific distribution and giving it a flat distribution of height ε. The
effectiveness of this method of deciding null was verified by the good maps made
by BAYES9 and PRIOR9.

A three-stage plan for testing nine-point rules on LANDSAT data from the
LACIE experiment is presented. The first stage is a test of all nine-point
rules on two LACIE intensive study sites. The measure of comparison will be
the percentage of interior points from ground-inspected fields that are correctly
classified. On the basis of the first-stage and aircraft data results, a small
number of rules will be chosen for further testing. The second stage will be
the same as the first, but with fewer rules and more sites. The third stage
will be a comparison of second-stage rules as acreage estimators.

The basic thresholded gradient and some modifications to it have been
tested and evaluated as a means of boundary point detection. The computational
efficiency of these methods is particularly appealing. The modifications were
developed to exploit some characteristics of the boundaries not utilized by
the basic technique.

The hypothesis testing techniques developed by LARS, Purdue have been 
tested and evaluated as a means of closed boundary formation. Two methods have 
actually been developed, one using first-order statistics only and the other 
also using second-order statistics. These techniques were investigated to 
gain a better insight into the closed boundary formation problem and also to 
better ascertain their performance. 

The performance of the gradient methods was for the most part unsatisfactory. 
These methods detected many true boundary points but also detected too many 
false boundary points. They are effective as an easily implemented means of 
boundary enhancement for visual examination. However, all of these techniques
suffer from two major problems: difficulty in choosing a proper threshold and
a tendency to emphasize randomly occurring variations.

The performance of the hypothesis testing techniques is more difficult to 
categorize. The most significant aspect of these techniques is the field 
building algorithm which guarantees closed boundaries. The method using
first-order statistics only performed better than that using second-order
statistics on the data sets tested. The results of both methods were
significantly poorer for the satellite data set than for the aircraft data set.

After testing the gradient and hypothesis testing techniques, it was
decided that the boundary detection problem must be approached in a thorough and
sophisticated manner. The approach which was adopted is to formulate the
boundary detection problem in increasingly complex steps and perform a rigorous
mathematical analysis of each. The formulations of the problem will contain
the atmospheric and sensor system effects on the data. Different boundary 
features and assumptions will also be investigated as part of these different 
formulations. Statistical signal detection and parameter estimation methods 
will be used for the analysis. 

Only one basic formulation of the problem has been investigated. The 
solution procedure which was derived is easily implemented and is also optimal 
under certain assumptions. The analysis of this formulation is indicative of 
the power of this approach.

PART I

NINE-POINT CLASSIFICATION 
Wyman Richardson 

2 

INTRODUCTION 


Multispectral classification rules in current use are based on information
from one pixel at a time. The objective of this investigation is to increase 
the accuracy of multispectral recognition by developing classification rules 
that use data from groups of pixels. 

The research has focused on "nine-point" rules (i.e., those which take
into account, when classifying a pixel, data from the eight surrounding
pixels) because they offer hope of satisfying the following requirements:
1) more accurate than purely spectral rules, 2) practical with respect to
execution speed and computer storage, 3) preserve as much resolution as
possible, and 4) suited to agricultural surveys such as LACIE.

Three such rules were implemented in the previous contract and tested on
one data set with encouraging results [1]. They are all "nine-point" rules,
that is to say they use data from the 3x3 grid formed by the pixel in
question and its eight immediate neighbors.

LIKE9, the nine-point likelihood rule, is the maximum likelihood decision 
rule derived from the assumption that the nine elements are an independent 
random sample from a multivariate normal distribution. It amounts to adding, 
for each material, the nine multivariate normal exponents and then choosing 
the material with the smallest sum. To prevent occasional alien points from 
disturbing the decision rule, we have modified it to sum only the m smallest
exponents, where m = 1, ..., 9.
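
The trimmed exponent sum can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the exponents for each pixel and material are taken as already computed, and the function name `like9`, the dictionary layout, and the example numbers are hypothetical rather than taken from the ERIM programs.

```python
def like9(exponents, m=9):
    """Pick a material by the LIKE9 rule.

    exponents maps each material to the nine multivariate normal
    exponents of the 3x3 neighborhood (a smaller exponent means a
    larger likelihood). Only the m smallest exponents are summed.
    """
    if not 1 <= m <= 9:
        raise ValueError("m must be between 1 and 9")
    sums = {mat: sum(sorted(vals)[:m]) for mat, vals in exponents.items()}
    return min(sums, key=sums.get)


# Hypothetical exponents: eight wheat-like pixels and one alien pixel (60).
neighborhood = {
    "wheat": [2, 3, 2, 4, 3, 2, 3, 2, 60],
    "barley": [9, 8, 9, 10, 9, 8, 9, 9, 9],
}

print(like9(neighborhood, m=9))  # the alien pixel dominates the wheat sum
print(like9(neighborhood, m=8))  # the alien pixel is trimmed away
```

In this made-up neighborhood the single alien value is enough to swing the m = 9 decision, while m = 8 discards it.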

AVE9, the trimmed mean rule, averages the nine data points and then 
applies the one-point (i.e., the usual) rule. To lessen its sensitivity to 
alien points we have deleted the t largest and the t smallest values of the 
nine in each channel, where t = 0, ..., 4.
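
A minimal Python sketch of the trimming step follows. It assumes each pixel is a list of channel values; the function name `trimmed_mean_vector` and the sample data are illustrative only. The resulting vector would then be handed to the one-point rule, which is not shown here.

```python
def trimmed_mean_vector(pixels, t=1):
    """Average nine data vectors channel by channel, after dropping the
    t largest and t smallest values in each channel (t = 0, ..., 4)."""
    if not 0 <= t <= 4:
        raise ValueError("t must be between 0 and 4")
    n_channels = len(pixels[0])
    means = []
    for c in range(n_channels):
        values = sorted(p[c] for p in pixels)
        kept = values[t:len(values) - t]
        means.append(sum(kept) / len(kept))
    return means


# Hypothetical two-channel neighborhood with one alien value in channel 0.
pixels = [[10, 20]] * 8 + [[100, 20]]

print(trimmed_mean_vector(pixels, t=0))  # plain mean, pulled up by the alien value
print(trimmed_mean_vector(pixels, t=1))  # trimmed mean, alien value removed
```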

VOTE9, the voting rule, is applied after one-point decisions have been made
on the nine pixels. It assigns to the center pixel the material most frequently 
recognized among the nine pixels. In case of a tie, the one-point decision on 
the center pixel is used. 
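
The voting step, including the tie rule, might look like the following Python sketch. The row-major ordering of the nine decisions (center at index 4) and the function name `vote9` are assumptions made for the example.

```python
from collections import Counter


def vote9(decisions):
    """Return the VOTE9 decision for the center pixel.

    decisions holds the nine one-point decisions of the 3x3 neighborhood
    in row-major order, so index 4 is the center pixel.
    """
    counts = Counter(decisions)
    top = max(counts.values())
    winners = [mat for mat, n in counts.items() if n == top]
    # A unique winner takes the pixel; a tie falls back on the center decision.
    return winners[0] if len(winners) == 1 else decisions[4]


print(vote9(["a", "a", "b", "a", "b", "a", "b", "b", "a"]))  # five votes for a
print(vote9(["a", "a", "a", "a", "c", "b", "b", "b", "b"]))  # 4-4 tie, center decides
```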

To compare and rank these three rules and the usual one-point rule, QRULE,
we ran a quantitative test by counting the number of points misclassified within
each of 42 field interiors in the Imperial Valley, California. A result of
the test was the following best-to-worst ranking of rule performance: LIKE9 with
m = 9, VOTE9, AVE9 with t > 0, AVE9 with t = 0, the one-point rule QRULE,
LIKE9 with m = 1. The performance of LIKE9 improved steadily as m went from
1 to 9. For m = 9, its error rate was about one-half that of the one-point
rule on the training sets, and on the test sets about three-fourths that of the
one-point rule.

Qualitative comparison of the rules was made by using each to generate 
a map of a stretch of Imperial Valley data. Each rule was programmed with an 
option to allow a decision against all the alternative materials and to display 
such decisions by leaving blanks on the map. Such a "null test" creates a white 
framework of roads, rivers, and other extraneous materials against which materials 
of interest stand out. Large numbers of isolated recognitions made the
one-point map difficult to read, but they were mostly removed by the nine-point
rules. Incorrect classifications on the nine-point maps tended to occur in large
patches. Fine detail on the one-point map, such as small roads, was lost or
distorted on the nine-point maps.

LIKE9 and AVE9 have in common the premise that all nine pixels in the 
neighborhood represent the same material. Such rules would be expected to do 
well in field interiors where the premise holds and poorly on the boundaries 
where it doesn't. It is important that rules do well in boundary areas so that 
they will be effective on satellite data which contains a high proportion of 
boundary points. For this reason, the present study has been concerned with 
developing and testing nine-point rules based on assumptions less rigid than 
neighborhood conformity. In section 3, two such rules are derived and a third,
arising naturally from the development of the second, is defined. In section 4,
these three rules are subjected to preliminary tests and compared in performance
with the previously studied rules and the one-point rule. Section 5 reports
the conclusions from these tests and recommends a plan for testing nine-point
rules on LANDSAT data from the LACIE study. A glossary is provided at the
end of the report to allow the reader to keep track of special names and
symbols.


3

THREE NEW NINE-POINT RULES

3.1 BAYES9, A RULE BASED ON PARTIAL DEPENDENCE OF NEIGHBORING PIXELS 

The classification rules previously studied have been based on one of two 
extreme assumptions: 

1) The distribution of the data vector X is independent of the estimates 
of materials in neighboring pixels. This is the assumption underlying 
the usual classification rule. 

2) The pixel and its neighbors represent the same material. This is 
the assumption on which the previously-studied nine-point rules are 
based. 

The nine-point rules based on the second assumption did better than the one-point
rule on field interiors but tended to clobber the boundary areas. The problem
in the boundary areas seemed to be that the nine-point rules were sensitive to
violations of the assumption on which they are based, as shown by their tendency
to put a 3x3 white rectangle on a classification map whenever they encounter
an odd data value. To correct such anomalies, a fix was programmed in the
rules to omit the oddest point from the decision criterion. This takes care of
the case where one point is nonconforming, but what if, in a boundary area,
several are?

A more fundamental reform is to return to the original intent of the 
nine-point rule and allow neighboring data values to influence but not dominate 
the center pixel decision. What is needed, in other words, is a rule based on 
an assumption flexible enough to tolerate nonconforming points in boundary
areas, which are of special concern because of the goal of applying nine-point
rules to satellite data in which boundary points are plentiful.

In this section, two rules based on flexible assumptions, BAYES9 and PRIOR9,
are derived. A third rule, PREF9, based on the rigid assumption 2), is suggested
by the derivation of PRIOR9.

BAYES9 is based on the assumption that a pixel probably represents the 
same material as neighboring pixels, a far more flexible assumption than the 
rigid requirement that it certainly conform. The flexible assumption, along
with a technical assumption and Bayesian decision theory, is sufficient
to derive the decision rule BAYES9.* The derivation is given as Appendix A.

The rule depends on a constant θ that describes the degree of dependence
between a pixel and its neighbor. The extreme values that θ can take are
0 for independence (assumption 1 above) and 1 for complete dependence
(assumption 2).

To define θ more precisely, we let I and J be neighboring pixels. We
assume prior probabilities P(I=a), where a is a typical material the pixel
might represent and P stands for the operation "probability that". The
conditional probability that I = a given that J = a, written symbolically
P(I=a|J=a), is P(I=a) under an assumption of independence and is 1 under an
assumption of complete dependence. The BAYES9 rule assumes a weighted average
of these two extreme values, with weights 1-θ and θ:

$$P(I=a \mid J=a) = (1-\theta)\,P(I=a) + \theta \cdot 1$$

When a and b are different materials, P(I=a|J=b) = P(I=a) under an
assumption of independence but is 0 under an assumption of complete dependence.
BAYES9 assumes the weighted average of these two values, with weights 1-θ and θ:

$$P(I=a \mid J=b) = (1-\theta)\,P(I=a) + \theta \cdot 0$$

To summarize:

$$P(I=a \mid J=a) = (1-\theta)\,P(I=a) + \theta$$
$$P(I=a \mid J=b) = (1-\theta)\,P(I=a), \qquad a \neq b$$

It is shown in Appendix A that this definition is consistent with the laws 
of probability. 
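
The consistency claim can also be checked numerically. The following Python sketch is not part of the original programs; the function name `conditional` and the example priors are illustrative assumptions. It verifies that, for each neighbor material b, the conditional probabilities sum to 1 over a, and that averaging over the neighbor's material recovers the prior P(I=a).

```python
def conditional(a, b, prior, theta):
    """P(I=a | J=b) under the partial-dependence assumption."""
    base = (1.0 - theta) * prior[a]
    return base + theta if a == b else base


# Hypothetical prior probabilities for three materials.
prior = {"wheat": 0.5, "barley": 0.3, "fallow": 0.2}
theta = 0.7

for b in prior:
    # The conditional probabilities over a must sum to 1 for every b.
    row_sum = sum(conditional(a, b, prior, theta) for a in prior)
    assert abs(row_sum - 1.0) < 1e-12

for a in prior:
    # Averaging over the neighbor's material must give back the prior of a.
    marginal = sum(conditional(a, b, prior, theta) * prior[b] for b in prior)
    assert abs(marginal - prior[a]) < 1e-12
```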

When the prior probabilities are equal, the decision criterion is shown in
Appendix A to be

$$P(X_0 \mid a) \prod_{i=1}^{8} \left[ P(X_i \mid a) + \frac{1-\theta}{k\theta} \sum_{b} P(X_i \mid b) \right] \qquad (1)$$

where

    X_0 is the data value at the center pixel
    X_1, ..., X_8 are the data values at the neighboring pixels
    P(X_i|a) is the probability density of X_i given material a
    k is the number of materials

*J. Gleason and M. McClary of ERIM originated the basic idea for this rule.

When θ = 1, the second term drops out and the decision criterion is the
product of the nine multivariate normal densities. This expression can be
computed by summing the nine exponents and then exponentiating, so it is
equivalent to the LIKE9 criterion. When θ → 0, the expression in the square
brackets is asymptotic to the second term, which is the same for all materials,
and thus the criterion reduces essentially to P(X_0|a), which is the criterion
of the usual classification rule.
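
The behavior at the two extremes can be seen in a small Python sketch of the equal-prior criterion. It assumes the densities P(X_i|a) are already available and strictly positive, works with logarithms to avoid underflow, and uses hypothetical names (`bayes9`, the dictionaries of densities) and made-up numbers.

```python
import math


def bayes9(center, neighbors, theta):
    """Return the material maximizing the equal-prior BAYES9 criterion.

    center    -- dict: material -> density P(X_0 | material), all > 0
    neighbors -- list of eight such dicts for X_1 .. X_8
    theta     -- dependence parameter, 0 < theta <= 1
    """
    k = len(center)
    weight = (1.0 - theta) / (k * theta)
    scores = {}
    for a in center:
        log_score = math.log(center[a])
        for dens in neighbors:
            log_score += math.log(dens[a] + weight * sum(dens.values()))
        scores[a] = log_score
    return max(scores, key=scores.get)


# Hypothetical densities: the center mildly favors b, the neighbors favor a.
center = {"a": 0.4, "b": 0.6}
neighbors = [{"a": 0.9, "b": 0.1}] * 8

print(bayes9(center, neighbors, theta=0.9))   # neighbors dominate the decision
print(bayes9(center, neighbors, theta=0.01))  # close to the one-point decision
```

With θ near 1 the neighbors carry the decision, as with LIKE9; with θ near 0 the square-bracket factors become nearly identical across materials and the center density decides.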

A suitable value of θ can be obtained from an empirical estimate p of the
probability that neighboring pixels represent the same material, calculated by
counting, on a recognition map, the number of pairs of neighboring pixels
representing the same material and dividing by the total number of pairs.
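
One way to carry out this count is sketched below in Python. The text does not say which pixels count as neighbors for this purpose, so the sketch assumes horizontally and vertically adjacent pairs only; that choice, the function name `same_material_fraction`, and the tiny example map are all assumptions.

```python
def same_material_fraction(label_map):
    """Estimate p from a recognition map given as rows of material labels.

    Counts horizontally and vertically adjacent pixel pairs (an assumed
    definition of "neighboring") and returns the fraction of pairs whose
    two pixels carry the same label.
    """
    rows, cols = len(label_map), len(label_map[0])
    same = total = 0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:  # pair with the pixel to the right
                total += 1
                same += label_map[r][c] == label_map[r][c + 1]
            if r + 1 < rows:  # pair with the pixel below
                total += 1
                same += label_map[r][c] == label_map[r + 1][c]
    return same / total


# Hypothetical 3x3 map: two columns of material a beside one column of b.
label_map = ["aab", "aab", "aab"]
print(same_material_fraction(label_map))
```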

Another estimate of p could be obtained from a geometrical probability 
calculation based on estimates of average field size and boundary width. 

In the case of equal prior probabilities, θ would be obtained from p by
solving

$$(1-\theta)\,\frac{1}{k} + \theta = p$$

where θ is the BAYES9 parameter
      k is the number of materials
      p is the probability that two neighboring pixels represent
        the same material.

This relationship between θ, k and p is illustrated by Table 1.

TABLE 1. TABLE OF p FOR VARIOUS VALUES OF θ AND k.

            θ = .1     .3     .5     .7     .9

  k =  2      .55    .65    .75    .85    .95
       3      .40    .53    .67    .80    .93
       4      .33    .48    .63    .78    .93
       6      .25    .42    .58    .75    .92
       8      .21    .39    .56    .74    .91
      10      .19    .37    .55    .73    .91
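
The table entries follow directly from the relation p = (1-θ)/k + θ, and solving that relation for θ gives θ = (kp - 1)/(k - 1). The Python sketch below shows both directions; the function names are illustrative, and the printed rows should agree with Table 1 to within rounding in the last digit.

```python
def p_from_theta(theta, k):
    """Probability that two neighboring pixels share a material."""
    return (1.0 - theta) / k + theta


def theta_from_p(p, k):
    """Invert p = (1 - theta)/k + theta for the BAYES9 parameter theta."""
    return (k * p - 1.0) / (k - 1.0)


thetas = [0.1, 0.3, 0.5, 0.7, 0.9]
for k in (2, 3, 4, 6, 8, 10):
    row = "  ".join(f"{p_from_theta(t, k):.2f}" for t in thetas)
    print(f"k = {k:2d}: {row}")

# Example: with k = 2 materials, an observed p of .75 corresponds to theta = .5.
print(theta_from_p(0.75, 2))
```

An empirical p smaller than 1/k would give a negative θ, which lies outside the range the rule allows, so such an estimate would need to be reconsidered rather than used directly.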
The apparently cumbersome and time-consuming decision criterion (1) can
actually be programmed to run quite rapidly. (The details are given in
Appendix B.) In a timing run on 10-channel data, 200 points per line, using
a subset of six channels, the BAYES9 rule took only 11% longer than the
usual maximum likelihood classification rule.

We run our classification rules with a null test (i.e., a test whether to 
decide "none of these") to allow for the existence in the scene of materials 
for which we don't have signatures. At first, with no clear rationale in mind, 
we used the test "whenever criterion (1) falls below a prescribed value, decide 
null". Maps made using this null test showed the same insensitivity to boundary 
detail and the same tendency to be splotched with 3x3 white rectangles as those 
made by LIKE9.

To correct this failing, a version of BAYES9, called BAYES8, was programmed
that left out the smallest square bracket factor in the BAYES9 decision
criterion:

$$P(X_0 \mid a) \prod_{i=1}^{8} \left[ P(X_i \mid a) + \frac{1-\theta}{k\theta} \sum_{b} P(X_i \mid b) \right] \qquad (1)$$

BAYES9 put out the number of the material with the largest value of criterion
(1) as channel 1 and the value itself, appropriately scaled, as channel 2 for
use in the null test. A noisy pixel X_i would have small values P(X_i|b)
for all b. Thus the i-th square bracket term would be small and therefore
criterion (1) would be small, signalling a null decision. The effect would
occur whenever the pixel was contained in the 3x3 neighborhood. Thus,
a noisy pixel mapped by BAYES9 and its null test would produce a 3x3 white
rectangle on the map.

When BAYES8 was defined to omit the smallest square bracket factor in
criterion (1), an isolated noisy pixel triggered the null test only when it
was the center pixel of the neighborhood and thus did not produce a white
rectangle. Maps made by BAYES8 show the white rectangle problem largely cured,
but remain insensitive to fine detail on the boundaries, as evidenced
by the omission and distortion of small boundaries observable on aerial
photographs. Moreover, there was a slight loss of accuracy on field interiors
(see section 4). BAYES8 takes longer to run than BAYES9: 16% longer than
QRULE rather than 11% longer.

BAYES9 in theory should do better in the boundary areas than completely 
dependent nine-point rules. Because it did not, we looked to a poorly-defined 
null test as a likely culprit. Instead of blindly defining a null test by 
decision criterion (1) we derived one logically as follows.* 

Whatever materials appear in the scene but are not given a specific
distribution are lumped together in a null category, N, which is assumed to
be distributed with a flat density of height ε. To simplify calculation and
application, we assume that all materials and N have the same prior probability
1/(k+1). The BAYES9 criterion for each non-null material a is

$$P(X_0 \mid a) \prod_{i=1}^{8} \left[ P(X_i \mid a) + \delta \sum_{b} P(X_i \mid b) \right] \qquad (1)$$

where

$$\delta = \frac{1-\theta}{\theta\,(k+1)}$$

and Σ_b P(X_i|b) includes the null density ε. The criterion for the null
material is

$$\varepsilon \prod_{i=1}^{8} \left[ \varepsilon + \delta \sum_{b} P(X_i \mid b) \right] \qquad (2)$$

The decision procedure is to choose the material with the largest criterion
(1) when this criterion exceeds the null criterion (2) and to choose null
otherwise.
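
The decision procedure can be sketched in Python as follows. The sketch takes δ = (1-θ)/(θ(k+1)), assumes the material densities are precomputed and strictly positive, and compares criteria (1) and (2) in logarithmic form; the function name `bayes9_with_null` and the example numbers are hypothetical.

```python
import math


def bayes9_with_null(center, neighbors, theta, eps):
    """Return the winning material, or None for a null decision.

    center    -- dict: material -> density P(X_0 | material), all > 0
    neighbors -- list of eight such dicts for X_1 .. X_8
    theta     -- dependence parameter, 0 < theta <= 1
    eps       -- height of the flat null density, > 0
    """
    k = len(center)
    delta = (1.0 - theta) / (theta * (k + 1))

    def total(dens):
        # Sum over all categories b, including the null density eps.
        return sum(dens.values()) + eps

    # Logarithm of criterion (2), the null criterion.
    log_null = math.log(eps) + sum(
        math.log(eps + delta * total(d)) for d in neighbors)

    best, best_log = None, -math.inf
    for a in center:
        # Logarithm of criterion (1) for material a.
        log_crit = math.log(center[a]) + sum(
            math.log(d[a] + delta * total(d)) for d in neighbors)
        if log_crit > best_log:
            best, best_log = a, log_crit

    return best if best_log > log_null else None


neighbors = [{"a": 0.9, "b": 0.1}] * 8
good_center = {"a": 0.8, "b": 0.2}
noisy_center = {"a": 1e-9, "b": 2e-9}  # fits no material well

print(bayes9_with_null(good_center, neighbors, theta=0.5, eps=1e-3))
print(bayes9_with_null(noisy_center, neighbors, theta=0.5, eps=1e-3))
```

In the second call the center pixel fits neither material, so the null criterion wins even though all eight neighbors look like material a.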

This definition of the null decision has two inconveniences. One is that 
the level E of the null test must be set before the decision rule is run, 
whereas the null level of previous rules could be changed after the decision run, 
and so allow a map with too many or too few blank spaces to be rerun at a
different level without redoing the decision run. This inconvenience can be
remedied by defining two ε's. The first ε, used inside the square bracket
factors, is set once and never changed. The ε outside the product sign, which
we will call ε_2, can be adjusted later by putting the null test in the form

"decide null if

$$\frac{\text{winning criterion (1)}}{\text{criterion (2)}} < 1,$$"

i.e., if

$$\frac{\text{winning criterion (1)}}{\prod_{i=1}^{8} \left[ \varepsilon + \delta \sum_{b} P(X_i \mid b) \right]} < \varepsilon_2 .$$

The left side, appropriately scaled, is stored on the processing output tape
as channel 2 (the decision between materials is channel 1) and is set as a
control variable in the mapping or tallying module.

*This rationale for defining a null test was suggested by R.J. Kauth of ERIM.

The other inconvenience is that it is not easy to estimate what
value of ε to use. To solve a simpler problem first, let us find the
density height ε that will reject one legitimate point in a thousand
from a multivariate normal distribution with mean μ and covariance
matrix R. The multivariate normal density is

$$c \, e^{-\frac{1}{2}\left[ (X-\mu)^{T} R^{-1} (X-\mu) + \log_e |R| \right]}$$

Because c is the same for all materials, it will not be calculated for
any of the material densities, and will therefore be omitted from further
discussion. The quantity Z = (X-μ)^T R^{-1} (X-μ) has a chi-square distribution.
When Z is constant, the density is constant. Suppose we have four
channels. From a table of the chi-square distribution we find that
the probability that Z > 18.465 is .001. Let ε be the density when
Z = 18.465. The probability of getting a data point X with a smaller
density than ε is .001. Thus

$$\varepsilon = e^{-\frac{1}{2}\left( 18.465 + \log_e |R| \right)}$$

Many signatures will produce many values |R_a|. We don't at the
moment have clear advice about whether to choose the biggest |R_a|, the
smallest |R_a|, or the average |R_a| for the calculation of ε. Experience
will have to be the guide.
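
The four-channel arithmetic can be reproduced with the Python sketch below. It relies on the closed form of the chi-square tail probability for 4 degrees of freedom, P(Z > z) = exp(-z/2)(1 + z/2), and omits the constant c as the text does; the function names and the example determinant are illustrative assumptions.

```python
import math


def chi2_tail_4(z):
    """P(Z > z) for a chi-square variable with 4 degrees of freedom."""
    return math.exp(-z / 2.0) * (1.0 + z / 2.0)


def null_height(z_cut, det_r):
    """Density height eps = exp(-(z_cut + ln|R|) / 2), constant c omitted."""
    return math.exp(-0.5 * (z_cut + math.log(det_r)))


# The tabulated cutoff rejects about one legitimate point in a thousand.
print(chi2_tail_4(18.465))

# Hypothetical covariance determinants: a larger |R| gives a smaller eps.
print(null_height(18.465, det_r=1.0))
print(null_height(18.465, det_r=100.0))
```

Because ε shrinks as |R| grows, choosing the largest determinant gives the most permissive null test and the smallest gives the strictest, which is the trade-off left open in the text.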
The null test we have defined is to choose null when

$$\frac{P(X_0 \mid a) \prod_{i=1}^{8} \left[ P(X_i \mid a) + \delta \sum_{b} P(X_i \mid b) \right]}{\prod_{i=1}^{8} \left[ \varepsilon + \delta \sum_{b} P(X_i \mid b) \right]} < \varepsilon_2 ,$$

i.e., when

$$P(X_0 \mid a) \prod_{i=1}^{8} \frac{P(X_i \mid a) + \delta \sum_{b} P(X_i \mid b)}{\varepsilon + \delta \sum_{b} P(X_i \mid b)} < \varepsilon_2 \qquad (3)$$

For the purpose of choosing among the non-null materials, criterion (3)
is equivalent to criterion (1) because it is criterion (1) divided by an expression
that is the same for all materials. The only additional computation required
is one floating add per pixel and one floating divide per channel. And
criterion (3) is the expression compared with a prescribed constant ε_2 in the
null test. This test can be applied when making a map or when tallying
the number of recognitions of each material.

BAYES9 with the better null test produced maps far more faithful to known
boundaries than those of the completely dependent nine-point rules. Moreover,
fidelity increased, as it should, when θ decreased from .9 to .1 (see section 4).

The decision rule is slightly changed by the improved null test because
of the presence of ε in Σ_b P(X_i|b). It should be a change for the better. When
the data X_i from a neighboring pixel really represent material a, then P(X_i|a)
is large compared to ε and the presence of ε in Σ_b P(X_i|b) makes virtually no
difference. But if X_i is an odd data value, all P(X_i|b)'s are much smaller than
ε and the square bracket factor is approximately δ/(1+δ) for every material.
Thus an odd point in the neighborhood would not make an unreliable contribution
to criterion (3). In practice, results on field interiors (see section 4) are
nearly identical for values of ε from 0 up to a value corresponding to a
rejection level of .01.

3.2 PRIOR9, A BAYESIAN RULE BASED ON PRIOR PROBABILITIES DERIVED FROM
NEIGHBORING DATA VALUES

The BAYES9 procedure of postulating a degree θ of dependence between
neighboring pixels is one way to classify pixels with due regard for the
influence but not domination of neighboring pixels. Another way is to use the
data values of the 3x3 neighborhood to set prior probabilities for the
decision on the center pixel. Such a procedure would embody the principle that
the prior likelihood that a pixel represents a given material is dependent
on the neighborhood in which the pixel is located. The classification rule
PRIOR9 carries out this principle.*

Let the center pixel be I_0 and the eight neighboring pixels I_1, ..., I_8.
Let the corresponding data values be X_0, ..., X_8. We assume
that the different materials a, b, c, ... have global prior probabilities
P(a), P(b), P(c), .... These can be set equal in the absence of other
information. When the material, say a, is known, then the data values
have known distributions, P(X_0|a), ..., P(X_8|a), often assumed to be
multivariate normal. By Bayes' formula, the posterior probability that
a pixel I_i represents material a given the data value X_i is

$$P(a \mid X_i) = \frac{P(a)\,P(X_i \mid a)}{\sum_{b} P(b)\,P(X_i \mid b)}$$

Thus after looking at the data from the nine pixels, we have nine
opinions P(a|X_0), ..., P(a|X_8) of the probability that a pixel in the
neighborhood is material a. They are consolidated into a single opinion
by answering the question "What is the probability that if we pick a
pixel at random from the neighborhood, it is material a?" This single
number, which we take as the local prior probability of a when
classifying the center pixel, is

$$\frac{1}{9} P(a \mid X_0) + \frac{1}{9} P(a \mid X_1) + \dots + \frac{1}{9} P(a \mid X_8)$$


*This decision rule was designed by J. Gleason and W. Richardson.

Thus the decision criterion, writing out the Bayes' formulation of
the posterior probabilities, is

$$P(X_0 \mid a) \left[ \frac{P(a)\,P(X_0 \mid a)}{\sum_{b} P(b)\,P(X_0 \mid b)} + \dots + \frac{P(a)\,P(X_8 \mid a)}{\sum_{b} P(b)\,P(X_8 \mid b)} \right] \qquad (5)$$

(The factor 1/9 is omitted because it is the same for all materials.)
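
A minimal Python sketch of criterion (5), without a null category, is given below. It assumes the nine density dictionaries are precomputed, with the center pixel first; the names `local_priors` and `prior9` and the pond-and-field numbers are hypothetical.

```python
def local_priors(densities, priors):
    """Sum of the nine posterior probabilities of each material.

    The common factor 1/9 is omitted, as in criterion (5).
    """
    totals = dict.fromkeys(priors, 0.0)
    for dens in densities:
        norm = sum(priors[b] * dens[b] for b in priors)
        for a in priors:
            totals[a] += priors[a] * dens[a] / norm
    return totals


def prior9(densities, priors):
    """Return the PRIOR9 decision for the center pixel.

    densities[0] belongs to the center pixel X_0, the rest to X_1 .. X_8.
    """
    local = local_priors(densities, priors)
    scores = {a: densities[0][a] * local[a] for a in priors}
    return max(scores, key=scores.get)


# Hypothetical example: a one-pixel pond surrounded by eight field pixels.
priors = {"pond": 0.5, "field": 0.5}
center = {"pond": 0.95, "field": 0.05}
neighbor = {"pond": 0.001, "field": 0.999}
densities = [center] + [neighbor] * 8

print(local_priors(densities, priors))
print(prior9(densities, priors))
```

Here the local prior strongly favors field, yet the center likelihood is decisive enough that the pixel is still classified as pond.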

This rule generalizes very simply to larger neighborhoods such as 
5x5 and 7x7. If the whole scene were used as the neighborhood, we would 
have the one-point rule with two passes through the data to more accurately
estimate the prior probabilities. 

A purist might object to using the data X_0 twice, once in the
calculation of the local priors and again in the application of the
decision rule. Two replies to this objection are:

1) Even if this rule were applied to a neighborhood of size one,
the decision rule would be equivalent to the one-point rule
because the material with the biggest weighted
likelihood would get the biggest weight for the second
application of the decision rule and would increase its lead
over its rivals.

2) X_0 must be represented in the calculation of the local prior
probabilities to allow for the possibility that the center 
pixel is unlike its neighbors. If it were a small pond, for 
example, and the posterior probability of pond were essentially 
zero for the neighbors, the center pixel would have to participate 
in the prior probability to make possible a correct decision. 

Because in practice many pixels do not represent a material searched 
for, the capability of deciding null (i.e., "none of these") is required 
for the accurate operation of the decision rule we have described. 
Otherwise, for such pixels meaningless posterior probabilities would 
be calculated and make unreliable contributions to the estimates of 
prior probabilities.

Our null test for PRIOR9 is based on the same principle as the one for
BAYES9 (section 3.1). We define an additional category, the null category N,
which has a flat distribution of height ε. N is chosen whenever criterion
(5) for N

$$\varepsilon \left[ \frac{P(N)}{P(N)\,\varepsilon + \sum_{b} P(b)\,P(X_0 \mid b)} + \dots + \frac{P(N)}{P(N)\,\varepsilon + \sum_{b} P(b)\,P(X_8 \mid b)} \right] \qquad (6)$$

is the largest. The prior probabilities are adjusted so that all of them
including P(N) add up to 1.

The posterior probability of any material a is

$$P(a \mid X_i) = \frac{P(a)\,P(X_i \mid a)}{\varepsilon P(N) + \sum_{\text{all materials } b} P(b)\,P(X_i \mid b)}$$

This posterior probability goes to zero for those pixels where all the
material likelihoods are very small and so the unreliability of estimating
the local prior probabilities is removed.

To simplify calculation and application, we assume that all materials 
and null have the same prior probability 1/(k+1), where k is the number of 
materials. Under these assumptions, the PRIOR9 criterion for each material 
a is 

$$
P(X_0 \mid a) \left[ \frac{P(X_0 \mid a)}{\varepsilon + \sum_b P(X_0 \mid b)}
+ \cdots +
\frac{P(X_8 \mid a)}{\varepsilon + \sum_b P(X_8 \mid b)} \right]
\tag{7}
$$

and for the null category is 

$$
\varepsilon^{2} \left[ \frac{1}{\varepsilon + \sum_b P(X_0 \mid b)}
+ \cdots +
\frac{1}{\varepsilon + \sum_b P(X_8 \mid b)} \right]
\tag{8}
$$

The null test is to decide null if 

$$
\frac{\text{winning criterion (7)}}{\text{criterion (8)}} < 1 ,
$$

i.e., if 

$$
\frac{P(X_0 \mid a) \left[ \dfrac{P(X_0 \mid a)}{\varepsilon + \sum_b P(X_0 \mid b)}
+ \cdots +
\dfrac{P(X_8 \mid a)}{\varepsilon + \sum_b P(X_8 \mid b)} \right]}
{\dfrac{1}{\varepsilon + \sum_b P(X_0 \mid b)}
+ \cdots +
\dfrac{1}{\varepsilon + \sum_b P(X_8 \mid b)}}
< \varepsilon^{2} .
\tag{9}
$$

The ε on the right side has been called $\varepsilon_2$ because it can be set at 
map-making time to regulate the amount of white space on the map, whereas 
ε must be set before applying PRIOR9 to the data values. The criterion can be 
thought of as the product of the center likelihood with a weighted average of 
the nine likelihoods. The weights, however, change from pixel to pixel. 
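
The PRIOR9 decision and its null test can be sketched in a few lines of Python. This is a minimal illustration under the equal-prior assumption of criteria (7) through (9), not the processing module of Appendix C. The function name `prior9_decide`, the layout `likelihoods[i][a]` (pixel 0 being the center), and the parameter names `eps` and `eps2` are assumptions made for the example.

```python
def prior9_decide(likelihoods, eps, eps2=None):
    """Classify the center pixel of a 3x3 neighborhood (illustrative sketch).

    likelihoods[i][a] is P(X_i | a) for pixel i (0 = center, 1..8 = neighbors)
    and material a. eps is the height of the flat null distribution and is
    assumed positive; eps2 is the map-making null level (defaults to eps).
    Returns the index of the winning material, or None for a null decision.
    """
    if eps <= 0:
        raise ValueError("this sketch assumes eps > 0")
    if eps2 is None:
        eps2 = eps
    n_materials = len(likelihoods[0])

    # Weight of pixel i: 1 / (eps + sum over materials b of P(X_i | b)).
    weights = [1.0 / (eps + sum(row)) for row in likelihoods]

    # Criterion (7): center likelihood times the weighted sum of likelihoods.
    center = likelihoods[0]
    criteria = [
        center[a] * sum(w * row[a] for w, row in zip(weights, likelihoods))
        for a in range(n_materials)
    ]
    winner = max(range(n_materials), key=lambda a: criteria[a])

    # Null test (9): center likelihood times the weighted average of the nine
    # likelihoods, compared with eps2 squared.
    null_statistic = criteria[winner] / sum(weights)
    return None if null_statistic < eps2 ** 2 else winner


# A "small pond" center pixel (material 1) surrounded by eight pixels of material 0.
pond = [[0.01, 0.80]] + [[0.90, 0.0001]] * 8
# A neighborhood in which every material likelihood is negligible.
unknown = [[1e-12, 1e-12]] * 9

pond_decision = prior9_decide(pond, eps=1e-6)
unknown_decision = prior9_decide(unknown, eps=1e-6)
```

In this toy input the pond pixel keeps its own class because the center likelihood multiplies the neighborhood sum, while the neighborhood of negligible likelihoods is decided null.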

PRIOR9 has been programmed as a processing module (Appendix C) and 
subjected to a preliminary test on aircraft data (section 4). Like BAYES9, it 
runs rapidly, taking only 12% longer than the usual one-point rule in a timing 
run on 10-channel data, 200 points per line, with a subset of six channels used. 

3.3 PREF9, AN IMPROVED VOTING RULE 

The local prior probability of each material a, 

$$
\frac{P(a)\,P(X_0 \mid a)}{\sum_b P(b)\,P(X_0 \mid b)}
+ \cdots +
\frac{P(a)\,P(X_8 \mid a)}{\sum_b P(b)\,P(X_8 \mid b)}
\tag{10}
$$

that was used by PRIOR9 in a Bayesian decision on the center pixel, can be used 
alone as the criterion for a nine-point rule, PREF9. The rule classifies the 
center pixel as the material giving the biggest value in answer to the question, 
"What is the probability that if you choose a pixel at random among the nine, 
it is material a?" The rule is similar in concept to the voting rule VOTE9 
(a rule classifying the center pixel as the one with the most first-place 
votes from the nine pixels) but bids fair to be more accurate because it 
uses all the information at each pixel and not just a vote for the winning 
material. 
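
The difference between summing posterior probabilities and counting first-place votes can be seen in a small sketch. The function names `pref9` and `vote9`, the input layout (nine rows of material likelihoods), the first-index tie-breaking, and the skipping of pixels whose likelihoods are all zero are assumptions made for this illustration; they are not taken from the processing modules.

```python
def pref9(likelihoods, priors=None):
    """Criterion (10): sum over the nine pixels of the posterior of each material."""
    n_materials = len(likelihoods[0])
    if priors is None:
        priors = [1.0 / n_materials] * n_materials  # assumed equal priors
    scores = [0.0] * n_materials
    for row in likelihoods:
        joint = [p * l for p, l in zip(priors, row)]
        total = sum(joint)
        if total == 0.0:
            continue  # assumption: a pixel with no support for any material adds nothing
        for a in range(n_materials):
            scores[a] += joint[a] / total
    winner = max(range(n_materials), key=lambda a: scores[a])
    return winner, scores


def vote9(likelihoods):
    """Count one first-place vote per pixel for its most likely material."""
    n_materials = len(likelihoods[0])
    votes = [0] * n_materials
    for row in likelihoods:
        votes[max(range(n_materials), key=lambda a: row[a])] += 1
    winner = max(range(n_materials), key=lambda a: votes[a])
    return winner, votes


# Five pixels favor material 0 by a hair; four favor material 1 strongly.
neighborhood = [[0.51, 0.49]] * 5 + [[0.05, 0.95]] * 4
vote_winner, votes = vote9(neighborhood)
pref_winner, scores = pref9(neighborhood)
```

Here the vote count goes to material 0 by five votes to four, whereas the summed posteriors favor material 1, because the four strong pixels carry more weight than the five marginal ones.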

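A null test for PREF9 is constructed by the same rationale as the one for 
BAYES9 and PRIOR9. We define a null category N having a flat distribution of 
height ε. N is chosen whenever criterion (10) for N, 

$$
\frac{P(N)\,\varepsilon}{P(N)\,\varepsilon + \sum_b P(b)\,P(X_0 \mid b)}
+ \cdots +
\frac{P(N)\,\varepsilon}{P(N)\,\varepsilon + \sum_b P(b)\,P(X_8 \mid b)}
\tag{11}
$$

exceeds criterion (10) for every material. Let us assume all the materials and 
N have equal prior probabilities 1/(k+1). The test for null can be put in 
the form 

$$
\frac{\text{winning criterion (10)}}{\text{criterion (11)}} < 1 ,
$$

which, when written in detail, is 

$$
\frac{\dfrac{P(X_0 \mid a)}{\varepsilon + \sum_b P(X_0 \mid b)}
+ \cdots +
\dfrac{P(X_8 \mid a)}{\varepsilon + \sum_b P(X_8 \mid b)}}
{\dfrac{\varepsilon}{\varepsilon + \sum_b P(X_0 \mid b)}
+ \cdots +
\dfrac{\varepsilon}{\varepsilon + \sum_b P(X_8 \mid b)}}
< 1 ;
$$

in other words, 

$$
\frac{\dfrac{P(X_0 \mid a)}{\varepsilon + \sum_b P(X_0 \mid b)}
+ \cdots +
\dfrac{P(X_8 \mid a)}{\varepsilon + \sum_b P(X_8 \mid b)}}
{\dfrac{1}{\varepsilon + \sum_b P(X_0 \mid b)}
+ \cdots +
\dfrac{1}{\varepsilon + \sum_b P(X_8 \mid b)}}
< \varepsilon_2 .
$$

Theoretically, $\varepsilon_2$ equals the prescribed null test level ε. But because 
$\varepsilon_2$ can be changed at map-making time, the practical course is to set 
ε as best we can, run the decision rule PREF9, and then adjust $\varepsilon_2$ to 
produce a suitable amount of white space on the map. 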

Because the BAYES9 and PRIOR9 decisions are primarily controlled by the 
center pixel values, we would expect these rules to be more sensitive to fine 
detail than PREF9, for which the criterion is a function of all nine points of 
the neighborhood without particular regard for the center pixel. For this 
reason, of all the nine-point rules so far defined, PRIOR9 and BAYES9 have the 
best hope of attaining the high resolution critical for processing LANDSAT 
data. We would expect all three rules to be relatively insensitive to the 
presence of noisy pixels in the neighborhood because very small likelihoods 
are dominated by ε in the decision criteria. 

The calculations to carry out PREF9 are nearly all required for PRIOR9 
(see Appendix C). For this reason both rules are implemented by the same 
processing module, and an input switch determines whether one rule or the other 
or both will be used. The timing run for PRIOR9, mentioned earlier, that took 
12% longer than the one-point rule was actually a run of PRIOR9 and PREF9 together. 
A negligible amount of time would have been saved by running one of them alone. 

4 

PRELIMINARY TESTS OF NINE-POINT RULES 


In this section, we report a preliminary test of the new rules 
BAYES9, PRIOR9 and PREF9 and a test of BAYES9 and the previously-studied 
rules LIKE9, AVE9 and VOTE9 on a new data set. 

The three nine-point decision rules introduced in this report, 
BAYES9, PRIOR9 and PREF9, were given preliminary tests on aircraft data 
collected from the Imperial Valley, California. This data set was used 
because 1) we have confidence in the ground truth [1, 2], 
2) the crops are not easy to distinguish, so that differences in performance 
of the rules may be displayed, 3) we have identified 42 fields for which 
the ground truth is unequivocal and the scan angle minimal, a number of 
replications sufficient for experimental comparison, 4) the previously-studied 
rules were tested on this data set and the results are available 
for comparison, 5) the field boundaries are easily-recognizable lines 
verified by aerial photographs, so that we can observe on maps made by 
the rules how well the rules are doing on the boundaries. The next stage 
of testing will be on the LACIE intensive study site data which was just 
becoming available when this report was written. 

The first test was on field interiors from the 42 fields. Signatures 
for alfalfa, barley, lettuce, rye, bare soil, sugar beets and safflowers 
were computed from a set SIGS 1 of 20 training fields and then from 
another, disjoint set of 20 training fields, SIGS 2. Two runs were made 
using SIGS 1 and SIGS 2, respectively. The per cent misclassified by the 
rules tested is reported in Table 2. Results for LIKE9 with m = 9, AVE9 
with t = 1 and VOTE9 are included for comparison. 


[1] W. Richardson, A Study of Some Nine-Element Decision Rules, Technical 
    Report 190100-32-T, Environmental Research Institute of Michigan, 
    Ann Arbor, Michigan, 1974. 

[2] R.F. Nalepka, Investigation of Multispectral Discrimination Techniques, 
    Technical Report 2264-12-F, Willow Run Laboratories, Ann Arbor, 
    Michigan, 1970. 

The "level" associated with BAYES9, PRIOR9 and PREF9 is the percentage 
of a typical material distribution where the density is less than the flat 
density of height ε in the null test (see section 3.1). To run one of 
these rules with a level of .001, say, you look up the number in a 
chi-square table at the .001 column and row 6 (if there were six original data 
channels), obtaining the number 22.457. You get the average log |R_i|, 
say 2.57, where R_i is the covariance matrix of material i, and include in 
the input the statements 

    EXPLIM = 22.457, LOGR = 2.57 

Log |R_i| is included in the output of the preceding module QRULE. The 
nine-point module computes 

$$
\varepsilon = e^{-(\mathrm{EXPLIM} + \mathrm{LOGR})}
$$

A level of 0 corresponds to an ε of 0. The BAYES9 results for ε = 0 
were obtained by an early version of the module that is logically equivalent 
to the current version when ε = 0 and the null test isn't used (as it wasn't 
in the field center experiment). 
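
The conversion from the two input statements to ε is a one-line calculation. The sketch below simply evaluates the expression printed above, assuming natural logarithms; the function name `null_height` is hypothetical and is not part of the nine-point module.

```python
import math


def null_height(explim, logr):
    """Return eps = exp(-(EXPLIM + LOGR)), as printed in the text (natural logs assumed)."""
    return math.exp(-(explim + logr))


# The example from the text: level .001 with six channels, average log |R_i| of 2.57.
eps = null_height(22.457, 2.57)

# A level of 0 corresponds to an unbounded EXPLIM, which drives eps to 0.
eps_level_zero = null_height(math.inf, 2.57)
```

For the example inputs ε comes out on the order of 1e-11, which shows how small the flat null density is relative to typical material likelihoods.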

BAYES9 and the three previously-studied rules LIKE9, AVE9, and VOTE9 
were tested on field interiors from four segments of the Corn Blight 
Watch Experiment in Indiana, 1971. A total of 225 fields were included. 
Seven materials were recognized: corn, soybeans, trees, pasture, hay, 
growing hay, and bare soil. For each material, every other field was 
selected as a training field in the order the fields appeared on the tape. 
This sampling scheme provided a class of training fields large enough to 
achieve meaningful training field tests of the decision rules. The choice 
of every other field avoided the possible inference that a peculiar 
grouping of training fields affected the results. 
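
The alternating selection can be sketched as follows. The function name `split_every_other`, the `(material, field_id)` tuples, and the choice that the first field of each material goes to the training set are assumptions made for the example; the text states only that every other field of each material was taken in tape order.

```python
def split_every_other(fields):
    """Split fields into training and test sets, alternating within each material.

    fields is a list of (material, field_id) pairs in the order the fields
    appear on the tape.
    """
    training, test = [], []
    seen = {}  # number of fields of each material encountered so far
    for material, field_id in fields:
        count = seen.get(material, 0)
        # Assumption: even-numbered occurrences (0, 2, 4, ...) become training fields.
        (training if count % 2 == 0 else test).append((material, field_id))
        seen[material] = count + 1
    return training, test


tape_order = [("corn", 1), ("soy", 2), ("corn", 3), ("corn", 4), ("soy", 5)]
training_fields, test_fields = split_every_other(tape_order)
```

Because the alternation is kept per material, each material contributes about half of its fields to each set regardless of how the materials are interleaved on the tape.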

A summary of the results is given in Tables 3-5. Imperial Valley 
results [1], reported in Table 1, are included for comparison. The error 
rate reported is an average of the error rates of the individual fields. 

TABLE 2. PERCENT MISCLASSIFIED BY VARIOUS DECISION RULES 
ON 42 FIELDS FROM THE IMPERIAL VALLEY USING TWO 
DIFFERENT SELECTIONS, SIGS 1 AND SIGS 2, OF 
TRAINING FIELDS 

                                   20 Training Fields     22 Test Fields 
Decision Rule                       SIGS 1    SIGS 2      SIGS 1    SIGS 2 

QRULE (usual rule)                   20.3      21.8        31.6      32.8 
PRIOR9, level = .001                 16.4      17.1        28.9      29.0 
PREF9, level = .001                  11.9      12.2        25.2      24.6 
VOTE9                                13.5      14.2        26.8      26.7 
BAYES9, level = 0,  θ = .1           13.7      14.0        27.2      25.7 
                    θ = .3           12.3      12.2        25.6      24.4 
                    θ = .5           11.9      11.7        24.8      23.9 
                    θ = .7           11.7      11.4        24.5      23.6 
                    θ = .9           11.6      11.1        24.3      23.4 
BAYES9, θ = .5,  level = 0*          11.85     11.70       24.81     23.90 
                 level = .001        11.87     11.61       24.75     23.93 
                 level = .01         11.81     11.62       24.76     24.07 
                 level = .1          11.98     12.16       24.87     24.69 
BAYES8, θ = .5                       12.6      13.0        25.4      25.0 
LIKE9, m = 9                         11.7      11.4        24.8      23.5 
AVE9, t = 1                          14.1      17.6        26.2      28.9 

* This line is repeated for ready comparison with the other levels. 

TABLE 3. PERCENT MISCLASSIFIED BY FIVE TYPES OF DECISION RULES 
ON TRAINING FIELDS FROM THE IMPERIAL VALLEY (TWO SETS 
OF SIGNATURES) AND FROM FOUR SEGMENTS OF THE CORN 
BLIGHT WATCH EXPERIMENT. 

                          Imperial Valley          Corn Blight Watch 
                          SIGS 1   SIGS 2     S203    S204    S212    S227 

Number of Fields:           20       20        17      34      27      30 

QRULE (Usual Rule)         20.3     21.8       5.6     4.7     7.0     5.8 
LIKE9,  m = 1              20.0     26.1       4.2     3.1     6.9     4.9 
        m = 3              16.1     21.9       3.2     2.7     6.1     4.8 
        m = 5              14.1     17.6       2.8     2.7     5.3     5.2 
        m = 7              12.7     13.3       2.8     2.7     5.0     4.3 
        m = 8              12.0     12.3       3.1     2.8     4.7     4.5 
        m = 9              11.7     11.4       3.2     3.0     4.7     4.4 
AVE9,   t = 0              15.5     18.9       3.3     2.9     5.5     4.4 
        t = 1              14.1     17.6       3.3     2.9     5.5     3.9 
        t = 3              13.7     17.6       3.0     3.0     5.3     4.4 
VOTE9                      13.5     14.2       2.7     3.1     5.2     4.2 
BAYES9, level = 0 
        θ = .1             13.7     14.0       3.5     3.5     5.5     4.9 
        θ = .3             12.3     12.2       3.1     3.2     5.0     4.1 
        θ = .5             11.9     11.7       3.0     3.2     4.9     4.0 
        θ = .7             11.7     11.4       2.9     3.1     4.8     4.0 
        θ = .9             11.6     11.1       2.9     3.0     4.8     3.7 
BAYES8, θ = .5             12.6     13.0       3.1     3.1     5.1     4.5 

TABLE 4. PERCENT MISCLASSIFIED BY FIVE TYPES OF DECISION RULES 
ON TEST FIELDS FROM THE IMPERIAL VALLEY (TWO SETS OF 
SIGNATURES) AND FROM FOUR SEGMENTS OF THE CORN BLIGHT 
WATCH EXPERIMENT. 

                          Imperial Valley          Corn Blight Watch 
                          SIGS 1   SIGS 2     S203    S204    S212    S227 

Number of Fields:           22       22        17      36      30      34 

QRULE (Usual Rule)         31.6     32.8      10.5    10.1    11.5    15.7 
LIKE9,  m = 1              34.1     35.7      10.5     7.0     9.0    14.9 
        m = 3              30.0     32.1       9.1     7.3     8.4    13.9 
        m = 5              27.3     28.8       8.8     7.9     8.4    13.5 
        m = 7              25.5     25.6       8.7     8.7     8.9    13.3 
        m = 8              25.0     24.1       8.8     9.1     9.1    13.4 
        m = 9              24.8     23.5       9.1     9.6     9.3    12.8 
AVE9,   t = 0              27.5     29.9       9.9     9.5     9.6    13.2 
        t = 1              26.2     28.9       9.8     9.4     9.5    13.5 
        t = 3              26.2     28.4       9.6     9.3     9.3    13.7 
VOTE9                      26.8     26.7       8.4     8.7     9.1    13.8 
BAYES9, level = 0 
        θ = .1             27.2     25.7       9.3     9.2     9.4    14.1 
        θ = .3             25.6     24.4       8.7     9.0     8.9    13.2 
        θ = .5             24.8     23.9       8.6     8.9     8.8    13.1 
        θ = .7             24.5     23.6       8.5     8.9     8.7    13.0 
        θ = .9             24.3     23.4       8.4     8.9     8.6    12.8 
BAYES8, θ = .5             25.4     25.0       8.7     8.8     8.7    13.4 

TABLE 5. ERROR RATES FOR FIVE TYPES OF DECISION RULES AVERAGED 
OVER FOUR SEGMENTS OF THE CORN BLIGHT WATCH EXPERIMENT 

                          Training Fields    Test Fields 

QRULE (Usual Rule)             5.77            11.96 
LIKE9,  m = 1                  4.78            10.32 
        m = 3                  4.19             9.66 
        m = 5                  3.98             9.67 
        m = 7                  3.72             9.89 
        m = 8                  3.80            10.08 
        m = 9                  3.81            10.20 
AVE9,   t = 0                  4.03            10.53 
        t = 1                  3.88            10.55 
        t = 3                  3.94            10.47 
VOTE9                          3.81            10.00 
BAYES9, level = 0 
        θ = .1                 4.35            10.48 
        θ = .3                 3.86             9.94 
        θ = .5                 3.76             9.84 
        θ = .7                 3.71             9.76 
        θ = .9                 3.61             9.69 
BAYES8, θ = .5                 3.93             9.91 

TABLE 6. PERCENT MISCLASSIFIED BY THE USUAL DECISION RULE ON TEST 
FIELDS FOR HAY, GROWING HAY AND PASTURE. THE COUNT IN THE 
FIRST COLUMN INCLUDES EVERY MISIDENTIFICATION; IN THE 
SECOND COLUMN, WRONG CHOICES AMONG THE THREE MATERIALS ARE 
NOT COUNTED 

MATERIAL   FIELD    INDIVIDUAL    3 MATERIALS GROUPED 

PAST         25          0                 0 
PAST         27         78                76 
PAST         86         22                 7 
PAST         88         52                33 
HAY          90         78                28 
HAY          92         81                81 
GHAY         94         24                22 
GHAY         96         16                11 
GHAY         98         54                38 
GHAY        100          3                 0 
PAST        139         20                 0 
PAST        141         35                14 
PAST        143         65                 2 
PAST        145         99                 0 
PAST        147          4                 4 
HAY         148         96                75 
HAY         150         99                12 
GHAY        152         84                16 
GHAY        154          0                 0 
PAST        201         99                96 
PAST        203          9                 0 
PAST        205         78                 1 
PAST        207         26                23 
HAY         208         68                 0 
HAY         210         55                55 
HAY         214         96                77 
GHAY        215         99                 1 
GHAY        217          3                 0 

TABLE 7. NUMBER OF DECISIONS MADE JOINTLY BY QRULE, 
PRIOR9 AND PREF9 ON 42 IMPERIAL VALLEY FIELDS 
USING THE FIRST SET (SIGS 1) OF TRAINING FIELDS 

                        20 Training Fields       22 Test Fields 
                        Correct   Incorrect     Correct   Incorrect 

QRULE alone               149        922          218        811 
PRIOR9 alone               26        209           34         97 
PREF9 alone               834        491          872        737 
QRULE with PRIOR9         241        857          427       1062 
QRULE with PREF9            7          1            9          2 
PRIOR9 with PREF9         587        257          545        364 
all three together       9278       1173         9169       2072 

The signatures for pasture, hay, and growing hay extended very poorly from 
training fields to test fields, as is shown in Table 6. To avoid letting 
these huge errors dominate the experiment, and mindful that the three 
materials were quite similar, we did not count an error if one of these 
three materials were recognized as another. If one of these materials 
were recognized as corn, say, then that was counted as an error, because 
we were seeing corn where there wasn't any. And, of course, mistaking 
a corn pixel for one of the three materials was also an error. 
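
The two ways of counting errors behind Table 6 can be written out directly. The sketch below is illustrative only; the labels `PAST`, `HAY`, `GHAY` follow Table 6, while the function names and the list-of-labels input are assumptions made for the example.

```python
GROUPED = {"PAST", "HAY", "GHAY"}  # pasture, hay, growing hay


def is_error(truth, decided, grouped=True):
    """Return True if the decision counts as a misclassification."""
    if decided == truth:
        return False
    # With grouping, confusing one of the three similar materials with
    # another of the three is not counted as an error.
    if grouped and truth in GROUPED and decided in GROUPED:
        return False
    return True


def percent_misclassified(truth, decisions, grouped=True):
    """Percentage of pixel decisions in one field that count as errors."""
    errors = sum(is_error(truth, d, grouped) for d in decisions)
    return 100.0 * errors / len(decisions)


# A hypothetical hay field with four pixels.
hay_field = ["HAY", "GHAY", "PAST", "CORN"]
individual = percent_misclassified("HAY", hay_field, grouped=False)
three_grouped = percent_misclassified("HAY", hay_field, grouped=True)
```

Only the corn decision survives as an error once the three materials are grouped, and a corn pixel called hay remains an error under either count.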

The test on the Corn Blight data (Tables 3 and 4) confirmed the 
previous finding, reported in columns 1 and 2, that the three 
previously-studied nine-point rules have lower error rates than the one-point rule 
and that the per cent improvement is greater for the training than for 
the test sets. These results were obtained on the interiors of fields 
where the assumption underlying the three rules, namely that a pixel 
represents the same material as its immediate neighbors, would be expected 
to hold. Yet surprisingly, the BAYES9 results for θ = .3 through 
θ = .9 were just as good as the other results. In the Imperial Valley 
tests the BAYES9 results for θ = .3 through θ = .7 are about the same 
as for LIKE9 with m = 8 and 9, and better than all the other results. The 
Corn Blight results were more contradictory, but the BAYES9 rule always 
did well, and if one looks at the results averaged over all four segments 
(Table 5) one finds the BAYES9 results, except for θ = .1, are best for 
the training fields and a close second to LIKE9 with m = 3 and m = 5 on 
the test fields. A possible explanation for the success of BAYES9 on field 
interiors is that even when one expects uniformity, there may be exceptional 
pixels or noisily recorded data where the probabilistic assumption is more 
applicable than the extreme assumption. 

In all six of the runs, the error rate for BAYES9 decreased substantially 
when θ went from .1 to .3, and although there was no substantial difference 
in the results for θ = .3 through .9, the rates decreased very gradually as 
θ increased. 

The Corn Blight results do not present the same consistent picture 
as the Imperial Valley results. The latter results showed a steady 
improvement as m increased from 1 to 9. In the Corn Blight segments 203 and 204, 
the best-m-of-nine rule LIKE9 did best for middle values of m. In the 
test field results of segment 204, the error rate worsened as m went from 
1 to 9. The voting rule VOTE9 was best in segment 203 and did fairly well 
in the other runs but was not exceptional. The trimmed mean rule AVE9 
generally didn't do as well as the others, but did well on the training 
fields of segments 204 and 227. LIKE9 with m = 9 was a little worse than 
m = 8 for all segments except 227. The BAYES9 rule for θ = .7 and θ = .9 
was best on segment 227 and the test fields of 203. 

A look at the error rates averaged over all four Corn Blight segments 
(Table 5) shows that the BAYES9 rule with θ = .9 made 2/3 the errors of 
the one-point rule on the training fields and 4/5 on the test fields. For 
the training fields, BAYES9 with θ = .5 through .9, LIKE9 with m = 7 through 9 
and VOTE9 had the lowest overall rates. For the test fields, BAYES9 with 
θ = .5 through .9 and LIKE9 with m = 3 through 7 had the lowest rates. BAYES8 
generally did a little worse than BAYES9 with a comparable setting of θ. 

Of the two voting rules, PREF9 did consistently a little better than 
VOTE9 on field interiors from the Imperial Valley data (Table 2) and is 
comparable in performance to the best of the other nine-point rules. 

The PRIOR9 test results fell about half way between the PREF9 and 
QRULE results. This is not surprising when you consider that the PRIOR9 
decision criterion $p_0 \sum_{i=0}^{8} p_i$ is equivalent to the geometric mean between 
the PREF9 criterion $\sum_{i=0}^{8} p_i$ and an expression $p_0$ equivalent to the QRULE 
decision criterion (see Appendix C for a definition of these terms). 

A small module was run to count the number of agreements and 
disagreements between QRULE, PRIOR9, PREF9 and the ground truth. The results are 
given as Table 7. The table shows that when QRULE and PREF9 disagree, 
PRIOR9 is slightly more likely to side with QRULE than PREF9, and when it 
does it is far more likely to be wrong. When all three rules agree, they 
are very likely to be right. The numbers in the table don't add up to the 
total number of pixels because when PRIOR9 sides with QRULE, for example, 
then PREF9 is alone and so that category is also incremented. All of the 
categories containing PREF9 add to the total number of pixels and that 
statement holds also for the other two rules. Theoretically, there should 
be no entries in the QRULE plus PREF9 category; a different method of 
handling ties must account for non-zero numbers there. 
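
The counting scheme described above can be sketched as follows, together with a check of the training-field columns of Table 7. The function names and the dictionary layout are assumptions made for the illustration; the small module actually used is not described in the text.

```python
from collections import defaultdict


def tally_agreement(decisions, truth):
    """Count joint decisions as {group of rules: [correct, incorrect]}.

    decisions maps a rule name to its list of per-pixel labels. For each pixel
    the rules are grouped by the label they chose, and every group increments
    its own category, so a lone dissenter is counted as well.
    """
    counts = defaultdict(lambda: [0, 0])
    rules = sorted(decisions)
    for i, true_label in enumerate(truth):
        groups = defaultdict(list)
        for rule in rules:
            groups[decisions[rule][i]].append(rule)
        for label, members in groups.items():
            counts[tuple(members)][0 if label == true_label else 1] += 1
    return dict(counts)


def pixels_seen_by(rule, counts):
    """Sum correct and incorrect counts over every category containing rule."""
    return sum(c + w for members, (c, w) in counts.items() if rule in members)


# Toy example with three pixels.
truth = ["a", "a", "b"]
toy = tally_agreement(
    {"QRULE": ["a", "b", "b"], "PRIOR9": ["a", "b", "a"], "PREF9": ["a", "a", "a"]},
    truth,
)

# Training-field columns of Table 7 as (correct, incorrect).
table7_training = {
    ("QRULE",): (149, 922),
    ("PRIOR9",): (26, 209),
    ("PREF9",): (834, 491),
    ("QRULE", "PRIOR9"): (241, 857),
    ("QRULE", "PREF9"): (7, 1),
    ("PRIOR9", "PREF9"): (587, 257),
    ("QRULE", "PRIOR9", "PREF9"): (9278, 1173),
}
totals = {r: pixels_seen_by(r, table7_training) for r in ("QRULE", "PRIOR9", "PREF9")}
```

For the training columns of Table 7 the categories containing each rule sum to the same total, 12,628 pixels, which agrees with the statement in the text.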

PRIOR9 performed as well as the moving average rule AVE9 on the training 
and test sets of the SIGS 2 run, and only slightly worse on the SIGS 1 run. 
The BAYES9 decision criterion 

$$
P(X_0 \mid a) \prod_{i=1}^{8} \left[
\frac{P(X_i \mid a) + \delta\varepsilon + \delta \sum_b P(X_i \mid b)}
     {\varepsilon + \delta\varepsilon + \delta \sum_b P(X_i \mid b)}
\right]
$$

gives slightly differing results for differing values of ε. When ε > 0, 
small probabilities $P(X_i \mid a)$ are overshadowed by ε and so the square bracket 
factor corresponding to an oddball point tends to make the same contribution 
for all materials. This should be preferable to the tendency of the ε = 0 
rule to let every oddball point in the neighborhood exert an erratic 
influence over the decision. Conversely, too large a value of ε might tend 
to impose an unnecessary uniformity on square bracket factors that would 
otherwise help distinguish between materials in a valid way. The test 
results in Table 2, however, show almost no difference in results for 
the levels 0, .001, .01 and even .1. In both test and training fields, 
the level .1 is very slightly worse. 

A second test of the new rules was to use them to make maps of a 
stretch of Imperial Valley data. Portions of these maps are given as 
Figures 2 through 6 in the order PRIOR9, PREF9, BAYES9 with θ = .1, with 
θ = .3 and with θ = .9. (The level in every case was .001.) For comparison 
a QRULE map is given as Figure 1 and a BAYES8 map as Figure 7. 

The purpose of the map making was to compare the performance of the 
rules on the boundary areas. The comparison is necessary because rules 
such as those assuming conformity of the nine pixels might do well on 
field interiors where the assumption holds and do badly on the boundaries 
where the assumption fails. The comparison is qualitative rather than 
quantitative because we can't be sure of the ground truth of any boundary 
pixel. We do know from aerial photographs [2] the pattern of field 
boundaries and observe that they are in many cases traced out or suggested 
in the QRULE (usual one-point rule) map. Rules that clobber the boundaries 
as observed on the QRULE map or omit many sections of boundaries are, we 
suspect, classifying poorly in the boundary areas. Rules that preserve 
fine detail on the boundaries are likely to be classifying more accurately 
there. 

To make the maps comparable, a processing module HISTBN was written 
that prints the cumulative distribution of a selected channel as a table 
of percentages. When we look at such a table of the null test criterion, 
we are able to determine a null test level that will produce a desired 
percentage of white space on the map. From the statistics of previous 
maps of the area, we determined that 11% white space produced clearly 
delineated boundaries while leaving intact the field interiors. By using 
HISTBN, we were able to print the seven maps with 11% white space. 
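
The idea of reading a null test level off a cumulative distribution can be sketched in Python. This is not the HISTBN module; the function names, the list input, and the convention that pixels with a criterion strictly below the threshold are left blank are assumptions made for the illustration.

```python
def cumulative_percent(values, cut_points):
    """Return (cut, percent of values <= cut) pairs, as a cumulative table."""
    n = len(values)
    return [(cut, 100.0 * sum(v <= cut for v in values) / n) for cut in cut_points]


def threshold_for_white_space(values, percent_white):
    """Pick a threshold so that about percent_white % of the values fall below it.

    Pixels whose null test criterion is strictly below the returned threshold
    would be printed as white space. Ties at the threshold can shift the
    achieved percentage slightly.
    """
    ordered = sorted(values)
    n_blank = round(len(ordered) * percent_white / 100.0)
    if n_blank >= len(ordered):
        return float("inf")  # every pixel is blanked
    return ordered[n_blank]


# One hundred distinct hypothetical criterion values in scrambled order.
criterion = [(37 * i) % 100 for i in range(100)]
level = threshold_for_white_space(criterion, 11)
blanked = sum(v < level for v in criterion)
```

With distinct criterion values the chosen level blanks exactly the requested share of pixels, which is what allows maps from different rules to be printed with the same amount of white space.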

The map of PREF9 (Figure 3), although it appears to be the neatest 
because of the smoothing tendency of the PREF9 rule, is the least faithful 
to fine detail on the boundary. Many sections of boundary observable on 
the QRULE map (Figure 1) are missing. The boundaries that do appear are 
distorted by white blobs. 

The PRIOR9 map (Figure 2) is very faithful to the fine detail on 
the boundaries. Hardly any sections of boundary suggested on the QRULE 
map are missing. The boundaries are the same reasonable shape as on the 
QRULE map; they are not distorted. The PRIOR9 map is also a good bit more 
speckled than the PREF9 map, reflecting its poorer performance on field 
interiors. 

FIGURE 1. QRULE, THE USUAL ONE-POINT RULE 

FIGURE 2. PRIOR9, A BAYESIAN RULE WHOSE 
PRIOR PROBABILITIES ARE DERIVED FROM 
NEIGHBORING DATA VALUES 

FIGURE 3. PREF9, AN IMPROVED VOTING RULE BASED 
ON POSTERIOR PROBABILITIES SUMMED OVER THE 
NEIGHBORHOOD 

FIGURE 4. BAYES9, θ = .1, A RULE BASED ON 
PARTIAL DEPENDENCE OF NEIGHBORING PIXELS, 
DEPENDENCE PARAMETER = .1. 

FIGURE 6. BAYES9, θ = .9, A RULE BASED ON 
PARTIAL DEPENDENCE OF NEIGHBORING PIXELS, 
DEPENDENCE PARAMETER = .9. 

FIGURE 7. BAYES8, θ = .3, A RULE LIKE BAYES9 
BUT WITH THE ODDEST NEIGHBOR OMITTED. 

The BAYES9 maps (Figures 4-6) show a gradation of performance. As 
θ goes from .1 to .3 to .9, small boundaries are lost, the boundaries get 
more distorted and the map less speckled. The map for θ = .3 appears to 
be a good compromise. The distortion is small, the speckling is slight 
and the small boundaries are mostly there. The θ = .9 map is more 
faithful to the boundaries than the PREF9 map and slightly more speckled. 

The BAYES8 map (Figure 7), although an improvement over BAYES9 maps 
made with the old null test, is considerably less faithful to boundary 
detail than the new BAYES9 maps (Figures 4-6), showing that the new null 
test rather than the BAYES8 approach was what was needed to demonstrate 
performance in the boundary areas. 

The maps of LIKE9 and AVE9 given in [1] are similar in 
appearance to the PREF9 and BAYES8 maps with respect to their neat 
appearance and their omission and distortion of boundaries. 

The map of VOTE9 [1] picked up many small boundaries missed by 
the other rules and some, even, missed by QRULE. This is because the null 
test criterion (as well as the decision criterion) is the number of first 
place votes for a material. When this fell below a prescribed number, a 
blank was printed. Thus when the 3x3 neighborhood fell on top of a 
boundary between two materials, even if there were no noticeable boundary 
strip such as a road between them, the disputed vote resulted in a null 
decision. However, many extraneous areas not identifiable with one of the 
materials were filled in rather than being left blank as with the other rules. 
Also, many interior points were left blank. Thus the VOTE9 null test was 
more of a boundary detector than a null test and is not comparable to the 
null tests of the other rules. 
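
The behavior of the VOTE9 null test on a boundary can be illustrated with a short sketch. The function name `vote9_with_null`, the label strings, and the threshold of six votes are hypothetical; the text says only that a blank was printed when the number of first-place votes fell below a prescribed number.

```python
from collections import Counter


def vote9_with_null(first_place, min_votes):
    """Return the material with the most first-place votes, or None for a blank.

    first_place lists the winning material at each of the nine pixels of the
    3x3 neighborhood; min_votes is the prescribed number of votes (assumed).
    """
    material, votes = Counter(first_place).most_common(1)[0]
    return material if votes >= min_votes else None


interior = ["wheat"] * 9
boundary = ["wheat"] * 5 + ["barley"] * 4  # neighborhood straddling two fields

interior_decision = vote9_with_null(interior, min_votes=6)
boundary_decision = vote9_with_null(boundary, min_votes=6)
```

The split vote on the boundary produces a blank even though every pixel is a recognizable material, which is why this test acts more as a boundary detector than as a test for unknown materials.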


5 

CONCLUSIONS AND RECOMMENDATIONS 


5.1 CONCLUSIONS 

The following conclusions are tentative because they are drawn from tests 
on two aircraft data sets. Further testing on satellite data will be required 
to establish their general validity. 

BAYES9 is a nine-point rule that combines accuracy in field interiors 
with sensitivity to detail on the boundaries. Although based on an assumption 
of only partial dependence, it performed as well on field interiors as any of 
the rules based on the presumably valid assumption of complete dependence and 
substantially exceeded the performance of the one-point rule, QRULE. In 
boundary areas, BAYES9 showed much less distortion and omission of boundary 
detail than the rules based on complete dependence. The increase in fidelity 
to detail as the BAYES9 dependence parameter θ went from .9 to .1 shows that 
BAYES9 can be adjusted to get the best tradeoff between boundary and field 
center performance. It is therefore a promising rule to use on LANDSAT data 
which has a high proportion of boundary points and for which the best balance 
between field interior and boundary performance has yet to be determined. 

PRIOR9, a Bayesian rule based on prior probabilities derived from 
neighboring data values, is particularly effective in boundary areas and 
outperforms QRULE on field interiors, although the margin of improvement is about 
half that of BAYES9, LIKE9, and PREF9. Because LANDSAT data contains a high 
proportion of boundary points, PRIOR9 merits testing on LANDSAT data. 

PREF9 is a rule based on posterior probabilities summed over the 3x3 
neighborhood. It is therefore based on the assumption of complete dependence 
(i.e., the assumption that all nine pixels represent the same material) and 
exhibits the strengths and weaknesses of such rules: good performance on 
field interiors and insensitive performance on the boundaries. PREF9 resembles 
the voting rule VOTE9 except that it uses all the information at each pixel of 
the neighborhood rather than just a vote for the winning material. It is 
thus theoretically more appealing than VOTE9 and it performed better in the 
preliminary tests. The margin of improvement was slight, and so the 
conclusion of superior performance is tentative. Because VOTE9 is analogous 
to counting recognitions to estimate acreage and PREF9 is analogous to the 
method of summing posterior probabilities, we are led to suspect that the latter 
method of acreage estimation is more accurate. 

The success of the performance of BAYES9 and PRIOR9 in boundary areas 
speaks well for the null test concept that they both use, namely, that 
materials appearing in the scene but not given a specific distribution can be 
lumped together in a null category which has a flat distribution of height ε. 
Theoretically, such a null test should perform best if ε is greater than zero 
but not too large. Judging from our preliminary test of BAYES9 at four 
rejection levels, the effect is very slight. Only at a rejection level of 10% 
was there even a slight upturn in rates. 

BAYES8, intended to fix BAYES9 to improve performance on the boundaries, 
is less faithful to fine detail on the boundaries than BAYES9 with the new 
null test, has poorer performance on field interiors, takes longer and does 
not have the sound theoretical justification of BAYES9. We conclude that 
BAYES8 is not sufficiently promising for further testing. 

The moving average rule AVE9 compares unfavorably with PRIOR9: it is hardly any 
better on field interiors and far worse on the boundaries. The chief value 
of AVE9 and VOTE9 is ease of calculation. These two modules could be run with 
the best linear rule [3] and thus save considerable time. If they were run 
with the usual one-point rule QRULE, they would not be much faster than BAYES9 
or PRIOR9, which took 11% and 12% longer, respectively, than QRULE alone in a 
six-channel timing run. 

5.2 RECOMMENDATIONS 

The development of nine-point classification rules that substantially reduced 
the error rate on field interiors from aircraft data and also appeared to classify 
accurately in boundary areas encourages the hope that these rules will generally 
outperform the usual one-point rule at the cost of only a slight increase in 
processing time. The applicability of such rules to satellite data processing 
remains to be investigated. We propose the following three-stage test of 
nine-point rules on LANDSAT data from the LACIE experiment. 

[3] R.B. Crane and W. Richardson, Performance Evaluation of Multispectral 
    Scanner Classification Techniques, Proceedings of the 8th International 
    Symposium on Remote Sensing of Environment, Environmental Research 
    Institute of Michigan, 1972. 

The first stage will be an exploratory comparison of all the nine-point 
rules on two LACIE intensive study sites. Interior points of areas associated 
with ground truth will be identified. Half these fields will be designated as 
training sets and half as test sets. The measure of performance will be the 
percentage of pixels correctly classified computed separately for the training 
and test fields. 

After considering the results from the first stage and from previous 
testing on aircraft data, no more than three nine-point rules will be chosen 
for extensive testing at the second stage. The procedure will be the same as 
for the first stage except that there will be fewer rules and more study sites 
tested. 

The third stage will be to test the second-stage rules as acreage estimators 
in two ways. The first way will be to run the rules as classifiers and 
estimate the wheat acreage, say, by the number of pixels classified as wheat. 
(The number of pixels rather than the number of acres will measure area.) The 
second way will be to calculate the posterior probability of wheat for each 
pixel and sum the posterior probabilities. 
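
The two acreage estimators can be contrasted in a brief sketch. The function names, the per-pixel dictionaries of posterior probabilities, and the example values are assumptions made for the illustration; area is measured in pixels, as stated above.

```python
def acreage_by_count(posteriors, material):
    """First way: number of pixels whose most probable material is `material`."""
    return sum(1 for p in posteriors if max(p, key=p.get) == material)


def acreage_by_posterior_sum(posteriors, material):
    """Second way: sum of the posterior probability of `material` over all pixels."""
    return sum(p.get(material, 0.0) for p in posteriors)


# Hypothetical posteriors for four pixels.
pixels = [{"wheat": 0.6, "other": 0.4}] * 3 + [{"wheat": 0.1, "other": 0.9}]
counted = acreage_by_count(pixels, "wheat")
summed = acreage_by_posterior_sum(pixels, "wheat")
```

In this toy case the count credits wheat with three full pixels, while the posterior sum credits it with less because each of those three decisions was only moderately certain.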

The purpose of the three-stage design is to get the most information for 
the least computer running time. A detailed description of the test plan is 
given as Appendix D.

PART II

BOUNDARY DETECTION
James M. Gleason

6

INTRODUCTION

The ultimate success of operational remote sensing systems depends upon 
the speed and accuracy with which information can be extracted from the data 
generated by these systems. The farmer, the city planner, and the conservationist, 
among others, desire reliable, up-to-date information. They will utilize the
system which provides this information in the most cost-effective manner. A
number of factors contribute to the appeal of remote sensing systems as a
feasible source of this information. Large ground areas can be surveyed
rapidly and frequently from satellites or high altitude aircraft. Multispectral
scanners can detect radiation from the ground in each of several wavelength
bands and store this data in a form acceptable to a computer. High speed
digital computers or specially designed hardware systems can rapidly process 
the tremendous bulk of data generated by these systems after some necessary, 
preliminary operations have been performed. Further advancements, however, 
must be achieved in reducing the amount of time required for these pre-processing 
operations. Also, more accurate, computerized methods for extracting the 
desired information from the recorded data must be developed. The cost- 
effectiveness of remote sensing systems cannot be firmly established until 
improvements are made in the accuracy and efficiency of these operations. 

One application of remote sensing which has received considerable attention 
is the determination of major crop categories in an agricultural region. A
multispectral scanner essentially decomposes the scene into a matrix of data
points, or pixels (picture elements), each pixel corresponding to the ground
resolution element size of the scanner. A multi-dimensional data vector is
recorded for each pixel with the data value in each channel proportional to the
radiance reflected from the ground in a particular spectral band. A maximum
likelihood recognition rule is often employed to classify each of these data
points as one of the crop categories of interest [4]. This rule is quite
easily implemented on the computer, with a quantity proportional to the 
likelihood function computed for each crop category. The data point is classified 
as the crop which maximizes the likelihood function. Before each pixel can be 
processed in this manner, however, a mean vector and covariance matrix must be 
estimated for each crop category. This preliminary operation is an error-prone 
and painstaking task which requires locating in the data a reasonable number 
of fields of each category and estimating the necessary parameters from these 
data points. Graymaps of several channels of data, ground truth maps and 
photographs of the area must be carefully examined to determine the location of 
these fields and the precise data points which correspond to each. The actual 
recognition processing is performed on each pixel individually, independent of 
all other points. Only the spectral characteristics of the crop categories 
are utilized in this decision process. 
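
The per-pixel rule described above can be sketched as follows, assuming Gaussian class models and only two channels so that the covariance matrix can be inverted in closed form. The category names, mean vectors, and covariance matrices are hypothetical values chosen for illustration; in practice they would be estimated from training fields as described in the text.

```python
import math


def log_likelihood(x, mean, cov):
    """Log of a two-channel Gaussian density, omitting the constant term.

    x and mean are (channel 1, channel 2) pairs;
    cov is ((c11, c12), (c12, c22)).
    """
    (c11, c12), (_, c22) = cov
    det = c11 * c22 - c12 * c12
    d1, d2 = x[0] - mean[0], x[1] - mean[1]
    # Quadratic form d' C^-1 d, using the closed-form inverse of a 2x2 matrix.
    quad = (c22 * d1 * d1 - 2.0 * c12 * d1 * d2 + c11 * d2 * d2) / det
    return -0.5 * (math.log(det) + quad)


def classify(x, signatures):
    """Return the category that maximizes the likelihood for pixel x."""
    return max(signatures, key=lambda name: log_likelihood(x, *signatures[name]))


# Hypothetical signatures: category -> (mean vector, covariance matrix).
signatures = {
    "wheat": ((40.0, 60.0), ((9.0, 2.0), (2.0, 16.0))),
    "bare soil": ((70.0, 50.0), ((25.0, 0.0), (0.0, 25.0))),
}

label = classify((45.0, 58.0), signatures)  # nearer the wheat signature
```

Each pixel is classified independently of its neighbors, which is the limitation that the boundary information discussed below is meant to address.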

This investigation has focused upon methods for locating agricultural field 
boundaries as one likely way of improving the procedures just described. Field 
boundaries are an important feature contained in agricultural data sets. The 
development of efficient techniques for extracting this feature will permit 
its utilization as a means of reducing the time required for preliminary processing 
operations and also increasing recognition accuracy. A computerized boundary detection 
algorithm would alleviate much of the manual work required to locate fields so that 
the necessary parameters can be estimated. Classification accuracy could be 
increased by using spatial features as well as spectral features. The location 
of field boundaries would allow entire fields to be classified as a whole, 
reducing the effects of random variations within each field. 

Field boundaries could be employed to increase accuracy and efficiency 
in other ways also. Time-consuming mixtures algorithms are used for estimating
the proportion of each crop category which contributed to a pixel overlapping
more than one field [5]. If the location of field boundaries were known, these
algorithms could be applied only in regions where mixtures will occur. Field
boundaries could also be used for multi-temporal image registration. To make
use of the time variations which occur among crop categories, data sets
recorded at different times can be processed simultaneously. Field boundaries
would provide an excellent feature to align these data sets because the boundaries
are quite stationary over reasonable lengths of time.

[4] R.B. Crane, W. Richardson, and W.A. Malila, A Study of Techniques for
Processing Multispectral Data, Technical Report 31650-155-T, ERIM,
Ann Arbor, Michigan, 1973.

The basic gradient and a number of modifications to it have been investigated
as a means of boundary point detection. The detection of boundary points
(not necessarily closed boundaries) is particularly useful for image registration
and mixtures applications. The basic thresholded gradient has been reported
as an inherently "noisy" technique. However, the computational efficiency
of the method is very appealing. Also, the possibility of improved performance
from modifications to the basic technique warranted a more thorough examination
of the method.

The hypothesis testing technique for closed boundary formation, developed
by LARS, Purdue [6,7], has been tested and evaluated. Closed boundaries are
necessary for field identification and classification applications. Actually,
two algorithms have been tested; one employing first-order statistics and the 
other also using second-order statistics. The evaluation of these algorithms 
was undertaken to gain a better insight into the closed boundary formation 
problem, and also to better determine the performance level of these methods. 

Efforts have also been initiated to analyze the boundary detection 
problem in a detailed and sophisticated manner. The problem will be formulated 
in increasingly complex steps and each formulation will be analyzed using 
rigorous mathematical techniques. These formulations will allow the different 
atmospheric and system effects in the data to be isolated and thoroughly
understood. Also, the possibility of employing different boundary features can be
investigated. Statistical signal detection and parameter estimation methods
will be used to analyze each formulation. One basic formulation of the
problem has been investigated in this manner.

[5] H.M. Horwitz, P.D. Hyde, W. Richardson, Improvements in Estimating
Proportions of Objects from Multispectral Data, Technical Report 190100-25-T,
ERIM, Ann Arbor, Michigan, 1974.

[6] R. Kettig & D. Landgrebe, Automatic Boundary Finding and Sample Classification
of Remotely Sensed Multispectral Data, LARS Information Note 041773,
Purdue University.

[7] J.N. Gupta & P.A. Wintz, Closed Boundary Finding Feature Selection and
Classification Approach to Multi-Image Modeling, LARS Information Note
062733, Purdue University.

7

GRADIENT TECHNIQUE AND MODIFICATIONS

The basic gradient technique and the modifications to it, which have 
been developed and tested, are boundary point detectors. These methods 
will result in certain pixels being designated as boundary points. They 
are not designed to guarantee that the boundaries which are detected will 
also be closed. As mentioned previously, some applications do not require 
closed boundaries. The removal of this closure constraint will, generally, 
simplify the detection problem. The gradient is basically a two-dimensional 
differentiation procedure. It can be used to indicate both the amount of
contrast about a given point and the direction in which this contrast takes
place.

The approximation to the spatial gradient used by Anuta [8] and also
discussed by Rosenfeld [9] and Duda & Hart [10] is illustrated in Figure 8
for a 3 pixel by 3 pixel scene area.

        A  B  C
        D  E  F
        G  H  I

    DX = |(C+F+I) - (A+D+G)|
    DY = |(A+B+C) - (G+H+I)|
    MAG = DX + DY

FIGURE 8

APPROXIMATION TO SPATIAL GRADIENT

The DX and DY components indicate the magnitude of the contrast 
occurring over the 3x3 array in two orthogonal directions. They can be 
added as vectors to yield a geometric magnitude and a directional component 
for this contrast. A reasonable approximation to this geometric magnitude 
is formed by the sum of the DX and DY components (MAG).

The gradient operator is employed in a sliding-window mode, such that
its magnitude (MAG) is computed at each point in the scene. Boundary points
are defined as those points at which the magnitude exceeds a specified
threshold value.

The following discussion refers to the 3x3 pixel array shown in 
Figure 8. The basic gradient and modifications which have been developed 
and tested are explained below.

1. BASIC GRADIENT 

This is the gradient approximation mentioned previously with the
addition of the geometric magnitude.

    DX = |(C+F+I) - (A+D+G)|
    DY = |(A+B+C) - (G+H+I)|
    MAG = DX + DY
    GMAG = sqrt(DX^2 + DY^2)

2. MEDIAN GRADIENT

This is very similar to the basic gradient except that the median
of three data values is used rather than their sum.

    DX = |MEDIAN(C,F,I) - MEDIAN(A,D,G)|
    DY = |MEDIAN(A,B,C) - MEDIAN(G,H,I)|
    MAG = DX + DY
    GMAG = sqrt(DX^2 + DY^2)

3. ADJACENT GRADIENT

This technique uses adjacent rows and columns rather than every
other row and column.

    DX = |(C+F+I) - (B+E+H)|
    DY = |(A+B+C) - (D+E+F)|
    MAG = DX + DY
    GMAG = sqrt(DX^2 + DY^2)

4. SECOND GRADIENT

This processing method uses either the DX and DY components, the
MAG component, or the GMAG component and computes a second gradient
exactly as in methods 1, 2, or 3 above.
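
The first three operators can be written compactly as functions of a 3x3 window, as in the sketch below. It is an illustration in pure Python, not the program used in this investigation. The window layout [[A, B, C], [D, E, F], [G, H, I]] follows the letters used in the expressions above; the function names, the zero-filled border, and the example scene are assumptions made for the illustration.

```python
import math
from statistics import median


def basic_gradient(w):
    """Return (DX, DY) for a 3x3 window [[A, B, C], [D, E, F], [G, H, I]]."""
    (a, b, c), (d, e, f), (g, h, i) = w
    return abs((c + f + i) - (a + d + g)), abs((a + b + c) - (g + h + i))


def median_gradient(w):
    """Same differences, using the median of each outer column and row."""
    (a, b, c), (d, e, f), (g, h, i) = w
    return (abs(median((c, f, i)) - median((a, d, g))),
            abs(median((a, b, c)) - median((g, h, i))))


def adjacent_gradient(w):
    """Differences between adjacent columns and adjacent rows."""
    (a, b, c), (d, e, f), (g, h, i) = w
    return abs((c + f + i) - (b + e + h)), abs((a + b + c) - (d + e + f))


def mag(dx, dy):
    return dx + dy


def gmag(dx, dy):
    return math.hypot(dx, dy)


def gradient_image(image, operator=basic_gradient):
    """Slide the operator over the scene; border pixels are left at 0."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for col in range(1, cols - 1):
            window = [row[col - 1:col + 2] for row in image[r - 1:r + 2]]
            out[r][col] = mag(*operator(window))
    return out


# Hypothetical scene: a vertical boundary between two uniform fields.
image = [
    [10, 10, 10, 50, 50],
    [10, 10, 10, 50, 50],
    [10, 10, 10, 50, 50],
    [10, 10, 10, 50, 50],
]
magnitudes = gradient_image(image)
boundary = [[m > 60 for m in row] for row in magnitudes]

# A single noise spike in one corner of an otherwise uniform window.
spike = [[10, 10, 10], [10, 10, 10], [10, 10, 90]]
spike_basic = mag(*basic_gradient(spike))    # responds to the spike
spike_median = mag(*median_gradient(spike))  # ignores the spike
```

A second gradient, under the same assumptions, would simply apply `gradient_image` to the output of a first pass.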

Some additional techniques have also been employed which are not 
strictly gradient methods. These are described below.

1. AVERAGE 

This algorithm substitutes the average of the data values over the 
3x3 array of points as the value of the center point. The gradient 
functions can then operate on these averaged data values. 

    MAG = (1/9)(A+B+C+D+E+F+G+H+I)

2. WEIGHTED AVERAGE 

This technique operates on the gradient values rather than the original
data values. It emphasizes boundary points by weighting the middle row
and middle column of the 3x3 gradient array. Essentially, it is a
correlation between a function, which peaks along the middle of the 3x3
array and slopes down at the sides, and the output of one of the gradient
operators. (Primes are used in the expressions below to indicate
that these are gradient values and not data values.)

    DX' = .5(C'+F'+I') + 2(B'+E'+H') + .5(A'+D'+G')
    DY' = .5(A'+B'+C') + 2(D'+E'+F') + .5(G'+H'+I')
    MAG' = DX' + DY'

3. LINEAR BOUNDARY 

This algorithm is designed to detect boundary points which lie on 
a straight boundary line segment over a 3 x 3 pixel array. The boundary 
segments must lie in either a horizontal, vertical or diagonal orientation 
through the center of the array. A boundary in any one of these 
orientations will have higher gradient values at the points which lie
directly on the boundary and smaller values at the points on both sides
of the boundary. The algorithm is thus designed to detect points
about which the highest gradient values lie along a straight line
through the center of a 3 x 3 pixel array. This scheme does not require
that a threshold value be specified. It is based on the expected
spatial characteristics of the boundaries and on the relative
magnitudes of the gradients over a 3 x 3 region centered at each
point in the scene.

Specifically, the algorithm indicates a boundary point wherever
gradient values, along three pixels in a straight line through the
center of the 3x3 array, are all greater than the average gradient
value taken over all nine points of the array. For example, a vertical
boundary point would be indicated if the gradients at pixels B, E and
H were all greater than the average gradient value calculated from
the points A through I.
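
The LINEAR BOUNDARY decision for one 3x3 array of gradient values can be sketched as below. The function name and the example arrays are hypothetical; the four lines tested are the horizontal, vertical, and two diagonal orientations through the center pixel E named in the description.

```python
def is_linear_boundary(grad):
    """Decide whether the center of a 3x3 gradient array is a boundary point.

    grad is laid out as [[A, B, C], [D, E, F], [G, H, I]]. The center is
    flagged when all three values on some straight line through E exceed
    the average of the nine values. No threshold is supplied by the user.
    """
    (a, b, c), (d, e, f), (g, h, i) = grad
    average = (a + b + c + d + e + f + g + h + i) / 9.0
    lines = [
        (d, e, f),  # horizontal line through the center
        (b, e, h),  # vertical line through the center
        (a, e, i),  # diagonal from upper left to lower right
        (c, e, g),  # diagonal from upper right to lower left
    ]
    return any(all(value > average for value in line) for line in lines)


# Hypothetical gradient arrays.
vertical = [[0, 120, 0], [0, 120, 0], [0, 120, 0]]
uniform = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
bent = [[0, 120, 0], [0, 120, 0], [0, 0, 120]]

vertical_hit = is_linear_boundary(vertical)  # straight vertical boundary
uniform_hit = is_linear_boundary(uniform)    # no contrast anywhere
bent_hit = is_linear_boundary(bent)          # boundary not straight over 3x3
```

The third array shows one limitation of the rule: a boundary that bends within the 3x3 array is not flagged.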

The detection of agricultural field boundaries would be enhanced by the use 
of more than one channel of the multispectral data. It is entirely possible
that boundaries which are clearly evident in one channel may not be evident
in another channel. Unfortunately, the gradient techniques are inherently 
one-channel operations. The basic gradient is an approximation to the 
two-dimensional derivative of a two-dimensional function, a definition 
which cannot be extended to more than one function. 

There are, however, a number of ways in which the gradient techniques 
can be applied to multispectral data. With the exception of the LINEAR 
BOUNDARY method, all of the gradient techniques produce a resultant output 
magnitude value for one channel of data. To handle the multiple channel 
case, these values can simply be summed over all of the channels of interest. 
Alternatively, boundary points could be detected in each channel separately, 
with the final multiple channel output indicating a boundary point whenever 
one has been detected in at least one channel, or possibly two or more 
channels. Another possible method would be to take the largest gradient
value over all of the channels as the final magnitude value.
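
The three ways of combining single-channel results mentioned above can be compared with a small sketch. Each channel is represented here as a flat list of gradient magnitudes, one value per pixel; the function names, thresholds, and data values are hypothetical.

```python
def combine_sum(channel_mags):
    """Sum the gradient magnitudes over all channels, pixel by pixel."""
    return [sum(values) for values in zip(*channel_mags)]


def combine_max(channel_mags):
    """Take the largest gradient magnitude over all channels."""
    return [max(values) for values in zip(*channel_mags)]


def combine_vote(channel_mags, thresholds, min_channels=1):
    """Flag a pixel detected in at least `min_channels` separate channels."""
    flags = []
    for values in zip(*channel_mags):
        hits = sum(1 for v, t in zip(values, thresholds) if v > t)
        flags.append(hits >= min_channels)
    return flags


# Hypothetical gradient magnitudes for four pixels in two channels.
channel_mags = [
    [5, 80, 10, 70],  # channel 1
    [6, 4, 90, 75],   # channel 2
]

summed = combine_sum(channel_mags)
largest = combine_max(channel_mags)
any_channel = combine_vote(channel_mags, thresholds=[50, 50], min_channels=1)
both_channels = combine_vote(channel_mags, thresholds=[50, 50], min_channels=2)
```

In this example the second and third pixels are boundaries in only one channel each, so they are kept by the one-channel vote and dropped by the two-channel vote.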
8

HYPOTHESIS TESTING TECHNIQUE 

Two methods of closed boundary formation developed by LARS,
Purdue [6,7] have been programmed, tested and evaluated. The two methods are
based on statistical hypothesis tests, one using first-order statistics 
only and the other also using second-order statistics. Small, square arrays 
of data points are combined into larger contiguous pixel groups by a 
field building algorithm, beginning at the top of the scene and proceeding 
downward through it. Ideally, the pixel groups will correspond to the 
agricultural fields present in the scene. The outer edge of each group 
forms a closed boundary. These groups are formed such that the data values 
of all of the pixels within each group are statistically similar in 
each channel. This statistical similarity is determined in one case 
by testing the null hypothesis that the means of two sets of samples
are equal against the alternative hypothesis that they are not equal.
This test employs only first-order statistics. An additional test on
the magnitude of the variance of each sample is also included. In the other 
case, second-order statistics are used to also test the null hypothesis 
that the variance of each set of samples is equal, against the alternative 
hypothesis that they are not equal. Two samples are merged together into 
one larger group, when the null hypotheses described above are not rejected. 
The degree of similarity required not to reject these hypotheses is governed 
by a chosen confidence level for each test. A complete description of the
field building algorithm and the two statistical similarity tests is given
in Appendix E.
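
The role of the confidence level in the merge decision can be illustrated with a textbook pooled two-sample t test on the means of two 2 x 2 cells in a single channel. This is a substitute chosen for illustration and is not the LARS test, whose exact statistics are given in Appendix E. The function names and data values are hypothetical, and the critical values are the standard two-sided Student's t values for 6 degrees of freedom.

```python
from statistics import mean, variance

# Two-sided critical values of Student's t with 6 degrees of freedom
# (two cells of 4 pixels each), taken from standard tables.
T_CRITICAL_DF6 = {90.0: 1.943, 95.0: 2.447, 98.0: 3.143, 99.0: 3.707, 99.9: 5.959}


def means_differ(cell_a, cell_b, confidence=99.0):
    """Reject the null hypothesis of equal means at the given confidence."""
    n_a, n_b = len(cell_a), len(cell_b)
    if n_a != 4 or n_b != 4:
        raise ValueError("the critical-value table assumes 2x2 cells")
    pooled = ((n_a - 1) * variance(cell_a) + (n_b - 1) * variance(cell_b)) / (n_a + n_b - 2)
    if pooled == 0.0:
        return mean(cell_a) != mean(cell_b)
    t = (mean(cell_a) - mean(cell_b)) / (pooled * (1.0 / n_a + 1.0 / n_b)) ** 0.5
    return abs(t) > T_CRITICAL_DF6[confidence]


def should_merge(cell_a, cell_b, confidence=99.0):
    """Merge two cells when the null hypothesis is not rejected."""
    return not means_differ(cell_a, cell_b, confidence)


# Hypothetical single-channel data values for three 2x2 cells.
cell_a = [20.0, 22.0, 21.0, 23.0]
cell_b = [23.0, 25.0, 24.0, 26.0]   # slightly brighter than cell_a
cell_c = [40.0, 42.0, 41.0, 43.0]   # a clearly different field

merge_ab_at_99 = should_merge(cell_a, cell_b, 99.0)
merge_ab_at_90 = should_merge(cell_a, cell_b, 90.0)
merge_ac_at_999 = should_merge(cell_a, cell_c, 99.9)
```

With these values the first two cells are merged at the 99% level but separated at the 90% level, which is the sense in which a lower confidence level produces more boundaries.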

The hypothesis testing algorithms which have been programmed for use 
in this investigation were intended to be as nearly equivalent to the LARS 
algorithms as possible. The program now allows for one of five confidence 
levels (90%, 95%, 98%, 99%, 99.9%) to be chosen for the similarity test.
Also, only one size is allowed for the sample array, 2 pixels by 2 pixels. 
One slight difference is known to exist in the manner in which the first 
row of samples is processed. However, this change only affects the first 
row and should have no significant effect on the overall results of the
method.

9

TEST AND EVALUATION OF GRADIENT AND HYPOTHESIS TESTING TECHNIQUES

The gradient and hypothesis testing techniques have been tested on both
aircraft and satellite data sets. The primary concern of this investigation 
has been the actual detection of field boundaries and not their use in the 
applications mentioned previously. Unfortunately, this emphasis leaves no 
other method for evaluating the results of these tests than a purely subjective 
one. The ultimate evaluation of any boundary detection technique will depend 
on how accurately and efficiently it can be utilized in a practical application. 
However, it is believed that reliable conclusions can be reached by a thorough 
analysis of the methods and the results which have been obtained. 

9.1 BASIC GRADIENT AND MODIFICATIONS 

The gradient boundary detection techniques were tested using one channel 
of the Imperial Valley data set (3/12/69, Run 5, 5000 ft., .478-.508 µm). This
data set was chosen because of its usefulness in previous experiments for 
separating good processing techniques from poor ones. For simplicity of 
analysis only the one channel case was considered. All of the gradient techniques, 
with the exception of the LINEAR BOUNDARY method, require that a threshold value 
be set which the gradient magnitude must exceed to be designated as a boundary 
point. However, the use of one threshold value is not sufficient to obtain
meaningful comparisons between these different techniques. The range of output
values from these methods will vary considerably because of the particular
computations involved in each. A threshold value which produces reasonable
results for one technique may produce very poor results for another. These poor
results for the latter method are caused by an improper threshold setting and
are not truly indicative of the capabilities of that technique. To avoid this 
difficulty, a different threshold setting was used for each method to be compared. 
The threshold value was chosen to produce a fixed percentage of the total number 
of data points as boundary points. The value for each particular technique was 
determined by generating a histogram of the output values and obtaining the 
proper threshold setting from that histogram. 
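
Choosing the threshold so that a fixed percentage of the points are designated as boundary points can be sketched as follows. Sorting the output values is used here in place of an explicit histogram, which gives the same threshold; the function name and the two lists of output values are hypothetical.

```python
def threshold_for_fraction(values, fraction):
    """Return a threshold exceeded by about `fraction` of the values.

    Points are designated boundary points when their value is strictly
    greater than the returned threshold, so ties at the threshold can
    make the detected fraction slightly smaller than requested.
    """
    ordered = sorted(values, reverse=True)
    k = min(int(fraction * len(ordered)), len(ordered) - 1)
    return ordered[k]


# Hypothetical outputs of two operators with very different ranges.
basic_out = [0, 5, 10, 20, 40, 120, 150, 8, 3, 12]
median_out = [0, 1, 2, 5, 9, 40, 50, 3, 1, 4]

basic_threshold = threshold_for_fraction(basic_out, 0.2)
median_threshold = threshold_for_fraction(median_out, 0.2)

basic_points = sum(1 for v in basic_out if v > basic_threshold)
median_points = sum(1 for v in median_out if v > median_threshold)
```

The two thresholds differ by a large factor, yet each operator designates the same number of boundary points, which is what makes the comparison between operators meaningful.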
The following qualitative results have been observed from the gradient
techniques which have been investigated: 

1. The gradient techniques are, indeed, "noisy" as indicated in the
literature and as expected. As the threshold is lowered, more true boundary
points are detected but more false boundary points are also detected. Many
false detections are made throughout the scene when the threshold is lowered
to a level at which the low contrast boundary points are correctly detected.
At this same level, the high contrast boundaries are significantly widened.
Many errors are also made near wide roads and other such inhomogeneous areas.

2. The MEDIAN GRADIENT is insensitive to noise spikes. The other 
techniques will most often incorrectly detect these spikes as boundary points. 

3. The MEDIAN GRADIENT values have a smaller range than the BASIC and
ADJACENT GRADIENT techniques because the DX and DY components are each
differences between two data values. In the other methods the differences
are taken between two sums of three data values.

The ADJACENT GRADIENT has a smaller range of values than the BASIC
GRADIENT. This can be attributed to the gradual change in data values across
the 3x3 array. The change across adjacent rows is not as great as between
two rows separated by one row.

The increased range of the BASIC GRADIENT makes it slightly less sensitive
to noise (excluding noise spikes) than the other two methods. In the MEDIAN and
ADJACENT methods all of the data is more lumped together and it is more
difficult to discriminate the boundary points.

4. There is no apparent increase in accuracy using the geometrical gradient 
magnitude (GMAG) as opposed to the algebraic magnitude (MAG). However, GMAG 
requires more computation time than MAG. 

5. After averaging (AVERAGE) the entire data set, all of the gradient 
methods widen the true boundaries but do not detect as many false boundary points. 
The averaging effectively blurs the scene. Noise variations are not as great but 
variations due to actual boundaries are also less pronounced and more spread out.

6. The WEIGHTED AVERAGE tends by its formulation to emphasize the
horizontal and vertical boundaries. It widens the actual boundaries and also 
forces them to line up in vertical and horizontal orientations. The averaging 
feature of this method also decreases the noise. 

7. The second gradient methods do not yield a significant increase in 
accuracy over the first gradient methods. The SECOND GRADIENT does eliminate
some of the interior points in regions which were entirely covered with first
gradient boundary points. This is due to the second differential character of 
this method and its insensitivity to uniform changes. This method also 
widens the boundaries. 

8. The advantage of the LINEAR BOUNDARY method is that an arbitrary 
setting of a threshold value is not necessary. Two problems arise, however.
First, the algorithm tends to be sensitive to the row nature of the agricultural
fields. Second, some boundary points are not detected because, in fact, the
boundary line segments which they lie on are not straight over a 3 pixel by
3 pixel scene area.

9.2 HYPOTHESIS TESTING TECHNIQUES 

The two hypothesis testing methods of closed boundary formation have been 
tested and evaluated using one aircraft data set and one satellite data set. 

The aircraft data set was taken from Mission 43M of the Corn Blight Watch 
Experiment, segment 204. Four channels of data were utilized (.46-.49 µm,
.50-.54 µm, .54-.60 µm, .61-.70 µm). This data set was chosen because of the
acceptable classification accuracy obtained from it during the Corn Blight 
experiment. The satellite data set was taken from the "San Francisco" ERTS 
frame (1003-18175). This data set was chosen because of the large agricultural 
fields present in the scene. 

For the data sets which were tested, the method using only first-order 
statistics produced superior results to that which also used second-order 
statistics. Also, the results for the aircraft data set were significantly better 
than those for the satellite data set. The data was analyzed in groups of 2 x 2
arrays of pixels, and although the satellite data set was chosen because of
its large fields, the number of pixels in each field was still quite small.
Many of the 2x2 cells overlapped more than one field, resulting in a significant
loss of detail.

An important parameter of these methods is the confidence level which must
be chosen. As this level is lowered, the number of errors of commission
(detecting boundaries which do not exist) should increase. At the same time,
the number of errors of omission (not detecting boundaries which do exist) should 
decrease. For the data sets used in this investigation, decreasing from the 
highest confidence level to the lowest, resulted in a minimal decrease in the 
number of errors of omission, as compared to the increase in the number of errors 
of commission. The errors of omission were negligible even at the highest 
confidence level (99.9%). The errors of commission, however, were unacceptably 
high at the lower confidence levels (90%, 95%, 98%) and only became reasonable 
at the higher confidence levels (99%, 99.9%). The variations between the fields 
in these data sets were of sufficient magnitude that the true boundaries could 
be detected even at the highest confidence level. The variations within the 
fields, however, were also of sufficient magnitude that many false boundaries 
were detected at the lower confidence levels. Only at the higher confidence 
levels, could these within-field variations be rejected, with reasonable 
accuracy, as possible boundaries. 

It appears that reasonably accurate results can be obtained with these 
methods if the fields in the scene contain a fairly large number of pixels and 
the within-field variations are significantly less than the between-field
variations. If these conditions are satisfied, the confidence level can be
adjusted to reject the within-field variations as possible boundaries, but still
detect the between-field variations. The basic concept of the field building
algorithm is a very good procedure for forming closed boundaries. The generally
small number of pixels in each field in satellite data sets and the unknown
nature of the variations of the data limit the application of these methods.

10

DETAILED ANALYSIS OF BOUNDARY DETECTION PROBLEM

The boundary detection problem is currently being approached in a thorough and 
sophisticated manner. The investigations conducted into the use of the gradient and 
hypothesis testing techniques have indicated that such an approach is required to 
obtain an acceptable solution. This problem is made difficult by the complex
factors which are involved in the acquisition of the data and also by the inherently 
variable nature of the boundaries. The most significant factors which affect the 
data are the atmosphere and the characteristics of the sensing system. The 
atmosphere absorbs radiation as it passes from the ground to the sensor and scatters 
radiation into the field of view of the sensor. The resolution of the scanning 
system and the sampling scheme which is employed are but two of the system factors 
which also affect the data. These atmospheric and system effects result in the 
boundaries being encoded in the data in a very complex manner. The variability
of the boundaries is due to the many possible combinations of ground classes
which they may separate, and the generally unknown spectral and spatial characteristics
of these classes. This variability results in a lack of prominent features which
can be used to discriminate between the boundaries and the remainder of the
scene. 

The problem is now being approached by formulating it in increasingly complex 
steps and using rigorous mathematical methods for analyzing each formulation.
The problem is first formulated in its most basic aspects. Additional factors which
affect the solution will be considered in later formulations. This sequential approach
will allow each factor to be isolated and clearly understood. Each factor
can first be considered independently and later in combination with others. In
this way, the complex atmospheric and sensor system effects which encode the
boundaries in the data can be analyzed and methods of decoding this data developed.

The mathematical techniques which are being employed are those of statistical 
signal detection and parameter estimation. These techniques have proven
useful in the extraction of information from radar and communications signals.

The variability of the boundaries will be taken into account by formulating 
the problem in terms of signals of known form and signals with unknown 
parameters. Formulations with small amounts of variability are considered 
first and more complex cases will be considered later. The solution of these
formulations of the problem will require that these various signals be detected 
in the data and then unknown parameters estimated. By analyzing each formulation 
with these rigorous statistical tools, optimal or near optimal methods of 
performing these operations may be developed and their performance 
predicted. 

Only one basic formulation of the problem has been investigated thus 
far. This formulation assumes that continuous data is obtained over 
a spatial interval in which a boundary is known to exist. The mean values of 
the data from the two ground classes separated by the boundary are assumed to be
known. The data is contaminated by zero-mean, additive, white Gaussian noise.
This noise reflects the variations which occur in the data from random atmospheric
and system effects and also from variability within the ground classes. The data
must be processed to obtain an estimate of the unknown boundary location.

The data, or received signal r(y), is recorded over the spatial interval
[-Y, Y]. This received signal consists of a signal of known form s(y), shifted
by the boundary location parameter $\gamma$, plus the noise signal n(y).

$$r(y) = s(y - \gamma) + n(y), \qquad -Y \le y \le Y, \quad -Y \le \gamma \le Y$$

The known signal s(y) extends over the infinite interval and has a form similar
to that shown in Figure 9. The two ground classes contained within the interval
each have a known mean value. The exact shape of the transition region
between these two values is primarily determined by the aperture function of
the sensing system. Specifying a value for the parameter $\gamma$ shifts this known
signal form to the right or left of the origin by an amount which depends on
the magnitude and sign of the value. The true boundary location is $\gamma_0$. The
estimate of the boundary location, $\hat{\gamma}$, should be as close to this true value as
possible.

FIGURE 9

KNOWN SIGNAL FORM s(y)

Our estimation procedure has been to find the maximum likelihood 
estimate of the boundary location. Such estimates have been shown to be 
useful in many diverse applications and have desirable statistical properties. 
Estimation procedures which employ the prior probability of occurrence of the 
unknown parameter could also have been investigated. Frequently, however, the 
distribution of the parameter is rather flat, particularly in the region near 
the true value, and the resulting estimation procedure is equivalent to maximum
likelihood. The maximum likelihood estimate of $\gamma$ is that value $\hat{\gamma}$ which
maximizes the likelihood function $p(r(y) \mid \gamma)$.

The log-likelihood function $\Lambda(\gamma)$, for continuous r(y) on the interval
[-Y, Y], is given by

$$\Lambda(\gamma) = \ln K - \frac{1}{N_0} \int_{-Y}^{Y} [r(y) - s(y - \gamma)]^2 \, dy$$

where K is a constant and $N_0$ is the noise spectral density. The maximum likelihood
estimate is that value $\hat{\gamma}$ which maximizes this expression when substituted for the
parameter $\gamma$.
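
For sampled data, maximizing the log-likelihood amounts to finding the shift of the known signal that minimizes the summed squared error. The sketch below does this by a simple search over candidate shifts. The smooth-step signal form, its parameters, the noise level, and the candidate grid are all assumptions made for the illustration and are not taken from this report.

```python
import math
import random


def s(y, low=10.0, high=50.0, width=0.5):
    """Assumed known signal form: a smooth step from `low` to `high` at y = 0."""
    return low + (high - low) * 0.5 * (1.0 + math.tanh(y / width))


def log_likelihood(samples, ys, shift, n0=1.0):
    """Discrete approximation of the log-likelihood with the constant dropped."""
    dy = ys[1] - ys[0]
    return -sum((r - s(y - shift)) ** 2 for r, y in zip(samples, ys)) * dy / n0


def ml_estimate(samples, ys, candidates):
    """Return the candidate shift with the largest log-likelihood."""
    return max(candidates, key=lambda shift: log_likelihood(samples, ys, shift))


# Simulated data on [-Y, Y] with Y = 5 and a true boundary at 1.2.
random.seed(0)
Y, n = 5.0, 201
ys = [-Y + 2.0 * Y * k / (n - 1) for k in range(n)]
true_shift = 1.2
samples = [s(y - true_shift) + random.gauss(0.0, 2.0) for y in ys]

candidates = [-Y + 0.05 * k for k in range(201)]  # search grid over [-Y, Y]
estimate = ml_estimate(samples, ys, candidates)
```

The resolution of the estimate is limited by the spacing of the candidate grid; a finer grid or a local refinement around the best candidate could be used where more precision is needed.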

A necessary condition which the maximum likelihood estimate must satisfy is
that the partial derivative of the log-likelihood function must equal zero when
evaluated at that value. Differentiating the expression for $\Lambda(\gamma)$ and evaluating it
at $\gamma = \hat{\gamma}$ yields the result that

$$\int_{-Y}^{Y} [r(y) - s(y - \hat{\gamma})] \, s'(y - \hat{\gamma}) \, dy = 0$$

must be satisfied for a maximum likelihood estimate.

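The second part of this integral,

$$\int_{-Y}^{Y} s(y - \hat{\gamma}) \, s'(y - \hat{\gamma}) \, dy,$$

can be easily evaluated and is equal to

$$\frac{1}{2} [s^2(Y - \hat{\gamma}) - s^2(-Y - \hat{\gamma})].$$

Thus, the estimate $\hat{\gamma}$ must satisfy the relationship

$$\int_{-Y}^{Y} r(y) \, s'(y - \hat{\gamma}) \, dy = \frac{1}{2} [s^2(Y - \hat{\gamma}) - s^2(-Y - \hat{\gamma})].$$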
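
The evaluation of the second integral follows from recognizing its integrand as an exact derivative. Writing $\hat{\gamma}$ for the boundary location estimate and $s'$ for the derivative of the known signal, the step can be written out as:

```latex
\begin{aligned}
\int_{-Y}^{Y} s(y-\hat{\gamma})\, s'(y-\hat{\gamma})\, dy
  &= \int_{-Y}^{Y} \frac{d}{dy}\left[ \tfrac{1}{2}\, s^{2}(y-\hat{\gamma}) \right] dy \\
  &= \tfrac{1}{2}\left[ s^{2}(Y-\hat{\gamma}) - s^{2}(-Y-\hat{\gamma}) \right]
\end{aligned}
```

The result depends only on the values of the shifted signal at the two ends of the observation interval.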

The right-hand side of this expression is essentially a bias term resulting
from the non-zero values of the shifted signal at the endpoints of the observation
interval. The left-hand side is obviously a correlation function, but the value
of this expression can also be realized as a convolution operation. The output
$v(y)$ of a linear filter with impulse response $h(y)$ and input $r(y)$ defined on
the interval $[-Y, Y]$ is given by the convolution of $h(y)$ and $r(y)$:

\[ v(y) = \int_{-Y}^{Y} r(u) \, h(y-u) \, du, \qquad -Y \le y \le Y \]

Applying the received signal to a filter with impulse response

\[ h(y) = s'(-y) \]

results in an output as a function of $y$ which will be identical to the value of
the left-hand side of the above expression as a function of the parameter $\gamma$.
The estimate $\hat{\gamma}$ is equal to the value of $y$ at which the filter output
is equal to $\tfrac{1}{2}\left[ s^2(Y-y) - s^2(-Y-y) \right]$.
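The matching condition described above can be sketched numerically. The following is a minimal illustration, not the report's implementation: the smooth step used for s(y), its parameters, the grid sizes and all function names are assumptions made for the example. It evaluates the filter output and the endpoint (bias) term on a grid of candidate shifts and returns the candidate at which the two agree most closely.

```python
import numpy as np

def smooth_step(y, m1=1.0, m2=3.0, width=0.05):
    """Hypothetical known signal s(y): a smooth transition from m1 to m2 at y = 0."""
    return m1 + (m2 - m1) * 0.5 * (1.0 + np.tanh(y / width))

def smooth_step_deriv(y, m1=1.0, m2=3.0, width=0.05):
    """Derivative s'(y) of smooth_step."""
    return (m2 - m1) * 0.5 / (width * np.cosh(y / width) ** 2)

def ml_boundary_estimate(r, y, candidates):
    """Return the candidate shift at which the correlation of r(y) with
    s'(y - gamma) is closest to the endpoint (bias) term."""
    dy = y[1] - y[0]
    big_y = y[-1]                      # observation interval is [-big_y, big_y]
    mismatch = []
    for gamma in candidates:
        # Filter output: integral of r(y) * s'(y - gamma) over the interval.
        filter_output = np.sum(r * smooth_step_deriv(y - gamma)) * dy
        # Bias term: one half of the difference of squared endpoint values.
        bias = 0.5 * (smooth_step(big_y - gamma) ** 2
                      - smooth_step(-big_y - gamma) ** 2)
        mismatch.append(abs(filter_output - bias))
    return candidates[int(np.argmin(mismatch))]

y = np.linspace(-1.0, 1.0, 4001)
candidates = np.linspace(-0.5, 0.5, 101)
r = smooth_step(y - 0.3)               # noise-free observation, true boundary at 0.3
estimate = ml_boundary_estimate(r, y, candidates)
```

With noisy data the mismatch would generally not reach zero on a finite grid, so a finer grid or a root finder around the sign change might be preferable.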
The distribution of the boundary location estimate obtained by this procedure
can be derived in the high signal-to-noise ratio case. The derivation is shown
in Appendix F. The maximum likelihood estimate $\hat{\gamma}$ has a Gaussian
distribution with mean value $\gamma$ and variance

\[ V(\hat{\gamma}) = \frac{N_o}{2\int_{-Y}^{Y} [s'(y)]^2 \, dy} \]

The Cramer-Rao bound is also derived in Appendix F. This bound specifies
the minimum variance which can be achieved by any estimate of the boundary
location. In the high signal-to-noise ratio case, the variance of the maximum
likelihood estimate is identical to the minimum variance specified by the
Cramer-Rao bound. This estimation procedure yields not only an unbiased
estimate but also a minimum variance estimate.

Consider, for example, a signal $s(y)$ obtained as shown in Figure 10.

Figure 10

GENERATION OF SIGNAL s(y)

The input signal $s_o(y)$ contains a perfect step function as its transition
region. The signal is passed through an ideal low pass filter, which bandlimits
the output signal $s(y)$ to frequencies less than $\Omega$. The high frequency
components resulting from the step function are removed by the filter, and the
output signal $s(y)$ is similar to that shown at the beginning of this section.
The variance of the maximum likelihood estimate of the boundary location
for this signal $s(y)$ will now be determined. The derivative of $s(y)$ will have a
value other than zero only in the immediate vicinity of $y = 0$. Letting

\[ g(y) = s'(y) \]

and using Parseval's theorem,

\[ \int_{-Y}^{Y} [s'(y)]^2 \, dy = \int_{-Y}^{Y} [g(y)]^2 \, dy = \frac{1}{2\pi}\int_{-\infty}^{\infty} |G(\omega)|^2 \, d\omega \]

where $G(\omega)$ is the Fourier transform of $g(y)$. The signal $s_o(y)$ has a
derivative $s_o'(y)$ and Fourier transform $S(\omega)$ given by

\[ s_o'(y) = (M_2 - M_1) \, \delta(y) \]

\[ S(\omega) = M_2 - M_1, \qquad -\infty < \omega < \infty \]

The derivative of the output signal $s(y)$ is equivalent to the output of the filter
if the input were $s_o'(y)$. The Fourier transform of the derivative of the signal
$s(y)$ is, thus,

\[ G(\omega) = M_2 - M_1, \qquad -\Omega < \omega < \Omega \]

The variance of the estimate $\hat{\gamma}$ is given by

\[ V(\hat{\gamma}) = \frac{N_o}{2\int_{-Y}^{Y} [s'(y)]^2 \, dy} = \frac{N_o \pi}{\int_{-\infty}^{\infty} |G(\omega)|^2 \, d\omega} \]

Substituting for the magnitude of $G(\omega)$, the variance of the maximum
likelihood estimate is

\[ V(\hat{\gamma}) = \frac{N_o \pi}{2\Omega (M_2 - M_1)^2} \]

The variance is proportional to the magnitude of the noise spectrum and inversely
proportional to the bandwidth of the signal and the squared difference of the two
mean values.
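As a worked illustration of the closing formula, the short sketch below evaluates the variance for given values of the noise spectral density, the filter bandwidth and the two mean values. The function and argument names are assumptions introduced for the example.

```python
import math

def boundary_variance(noise_density, bandwidth, m1, m2):
    """Variance of the boundary estimate for a bandlimited step:
    N_o * pi / (2 * Omega * (M2 - M1)**2)."""
    return noise_density * math.pi / (2.0 * bandwidth * (m2 - m1) ** 2)

# Doubling the bandwidth halves the variance; doubling the step height quarters it.
base = boundary_variance(2.0, math.pi, 0.0, 1.0)
wider = boundary_variance(2.0, 2.0 * math.pi, 0.0, 1.0)
taller = boundary_variance(2.0, math.pi, 0.0, 2.0)
```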
The analysis of this simplified formulation of the boundary detection
problem is but a first step towards an acceptable solution. The results 
which have been presented, however, indicate the power of the method. The 
convolution procedure which has been developed for obtaining the maximum 
likelihood estimate is a useful operation in many applications. It can be 
implemented by approximating the required impulse response and utilizing
either an electrical circuit or digital computer to perform the necessary 
filtering. The variance of the estimate obtained by this procedure is the 
minimum variance which can be obtained by any estimation procedure in the 
high signal- to-noise ratio case. In this sense, the estimate is optimal. 

The variance of the estimate for the bandlimited signal indicates the accuracy
which can be expected in this case and the factors which affect this accuracy. 
If the values of these factors are known in advance, the performance of the 
technique can be predicted without having to rely on actual test results. 

Also, the factors which affect the estimate and their relative significance 
can be clearly determined and understood. The advantages of this approach 
will be more clearly evident as more complex formulations of the problem
are considered. 
CONCLUSIONS AND RECOMMENDATIONS

Investigations of the use of the gradient and hypothesis testing 
techniques have indicated the need for a thorough and sophisticated 
analysis of the boundary detection problem. The analysis is required 
because of the complex manner in which the boundaries are encoded in the 
data, and also because of the lack of prominent features by which the 
boundaries can be discriminated in the scene. The approach which has 
been adopted is to formulate the problem in increasingly complex steps
and analyze each formulation using rigorous mathematical methods. In 
this manner, each factor which affects the encoding process can be 
systematically included in the formulation and clearly understood. 

Also, the use of various boundary features can be investigated by formulating
the problem in terms of known signal forms to be detected and unknown
parameter values to be estimated. Statistical signal detection and 
parameter estimation methods can then be employed to investigate the 
solution of each formulation. These methods will often result in the 
development of optimal or near-optimal solution techniques. 

The basic formulation of the problem which has been analyzed indicates 
the power of the method. The solution technique which has been developed 
is easily implemented and yields a boundary location estimate which is 
unbiased and also optimal in the sense of minimum variance. The factors 
which affect the accuracy of this estimate for a certain case and their
relative significance are also clearly indicated. 

Additional formulations of the problem should be analyzed in a similar 
manner. These formulations should include the effects of the atmosphere, 
the resolution of the system and the sampling scheme. These formulations 
should also include assumptions which are less restrictive than those 
used in this simple formulation. 

The results of the basic gradient technique and the modifications which 
were developed, were for the most part unsatisfactory. Test results 
demonstrated that, for a reasonable threshold setting, many true boundary 
points were detected but too many false boundary points were also detected.

In regions where large contrast boundaries existed, the gradient techniques 
significantly widened the boundaries. In regions of low contrast boundaries, 
many false boundary points were detected throughout the scene before the 
threshold was sufficiently lowered to the level that the low contrast 
boundary points were detected. The gradient methods are effective as an 
easily implemented means of boundary enhancement for visual examination.
However, difficulties in choosing a proper threshold and a tendency to 
emphasize randomly occurring variations, must be overcome before these 
techniques can be effectively employed. 

The results of the hypothesis testing methods are more difficult 
to categorize. The method using first-order statistics performed better 
than that which also used second-order statistics. The results achieved
with the aircraft data set at the higher confidence levels were reasonable.
The results for the satellite data set were less satisfactory.

The most significant aspect of these methods is the field building 
algorithm which guarantees closed boundaries. The results which can be 
achieved by these methods depend primarily on the within-field and 
between-field variations which occur in the data set. The confidence
level determines the amount of within-field variation allowed. If 
the within-field variation is less than the between-field variation
throughout the scene, good results can be achieved with a proper 
confidence level. As the within-field variations approach the between- 
field variations, however, the results will not be as satisfactory. Also, 
the small number of pixels in the fields contained within satellite data 
sets significantly limits the use of these techniques for satellite 
remote sensing operations. 
APPENDIX A
DERIVATION OF BAYES9


The basic assumption of BAYES9 is that a pixel probably represents
the same material as neighboring pixels. The following model expresses this
assumption as a function of $\theta$, a parameter between 0 and 1 measuring the
degree of dependence between a pixel and its neighbor.*

Let I and J be neighboring pixels, let a and b be materials and let P
stand for "probability that". Assume that there are prior probabilities P(I=a)
for all materials a. Under conditions of complete independence

\[ P(I=a \mid J=a) = P(I=a) \]
\[ P(I=a \mid J=b) = P(I=a) \]

and under conditions of complete dependence

\[ P(I=a \mid J=a) = 1 \]
\[ P(I=a \mid J=b) = 0 \]

When the truth lies somewhere between dependence and independence, the
model defines these probabilities to be a weighted average of the extreme
values:

\[ P(I=a \mid J=a) = (1-\theta) P(I=a) + \theta \cdot 1 \]
\[ P(I=a \mid J=b) = (1-\theta) P(I=a) + \theta \cdot 0 \]

which, rewritten neatly, is

\[ P(I=a \mid J=a) = (1-\theta) P(I=a) + \theta \]
\[ P(I=a \mid J=b) = (1-\theta) P(I=a) \]

When $\theta = 0$, the neighboring pixels are completely independent. When
$\theta = 1$, they are completely dependent. Thus $\theta$ can be thought of as a
"coefficient of dependence" ranging between 0 and 1.

We will show that this definition satisfies the postulates of probability.
Obviously $P(I=a \mid J=a)$ and $P(I=a \mid J=b)$ are between 0 and 1. It remains
to show that

\[ \sum_b P(J=b \text{ and } I=a) = P(I=a) \]

which is equivalent to showing that

\[ \sum_b \frac{P(J=b \text{ and } I=a)}{P(I=a)} = 1 \]

i.e., that

\[ \sum_b P(J=b \mid I=a) = 1 \]

by the definition of conditional probability. Now

\[ \sum_b P(J=b \mid I=a) = P(J=a \mid I=a) + \sum_{b \ne a} P(J=b \mid I=a) \]
\[ = (1-\theta) P(J=a) + \theta + (1-\theta) \sum_{b \ne a} P(J=b) \]
\[ = (1-\theta) \sum_b P(J=b) + \theta \]
\[ = 1 - \theta + \theta = 1 \]

because $\sum_b P(J=b) = 1$. Q.E.D.

*The following ERIM personnel assisted in this derivation: J. Gleason originated
the conditional probability definition of the model, R. Crane suggested an
approach to the derivation, and R. Kauth corrected a significant error.
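The normalization just proved can be checked numerically. The sketch below is illustrative only: the material names, the prior values and the function name are assumptions, not part of the report.

```python
def conditional_prob(prior, theta, a, b):
    """P(I=a | J=b) under the theta model: (1 - theta) * P(a), plus theta when a == b."""
    base = (1.0 - theta) * prior[a]
    return base + theta if a == b else base

prior = {"wheat": 0.5, "corn": 0.3, "soil": 0.2}   # hypothetical prior probabilities
theta = 0.6                                        # hypothetical coefficient of dependence

# Sum over all materials b of P(J=b | I=wheat); the proof says this equals 1.
total = sum(conditional_prob(prior, theta, b, "wheat") for b in prior)
```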

We will establish the relationship between $\theta$, the number k of materials and
the probability p that two neighboring pixels are the same material. This
relationship holds when all the prior probabilities P(I=a) are equal, and since
there are k of them,

\[ P(I=a) = 1/k \]

for every material a. p, the probability that a pixel is the same material
as its neighbor, is more precisely

\[ P(I=a \mid J=a) = (1-\theta) P(I=a) + \theta \]

so

\[ p = (1-\theta) \frac{1}{k} + \theta \]

Alternatively,

\[ \theta = \frac{kp - 1}{k - 1} \]
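The two forms of this relationship can be exercised with a short round trip. This is a sketch with hypothetical function names; it assumes equal priors 1/k as stated in the text.

```python
def same_material_prob(theta, k):
    """p = (1 - theta) / k + theta, for k equally likely materials."""
    return (1.0 - theta) / k + theta

def dependence_coefficient(p, k):
    """theta = (k * p - 1) / (k - 1), the inverse of same_material_prob."""
    return (k * p - 1.0) / (k - 1.0)

p = same_material_prob(0.7, 5)          # 0.3 / 5 + 0.7 = 0.76
theta_back = dependence_coefficient(p, 5)
```

Note that p = 1/k gives a coefficient of 0 (independence) and p = 1 gives a coefficient of 1 (complete dependence).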


Let X and y be data values from I and J respectively. We will derive the
probability that I=a given data X from I and data y from J:

\[ P(I=a \mid X, y) = \frac{P(X \text{ and } y \text{ and } I=a)}{P(X \text{ and } y)} \]

by the definition of conditional probability. Let g be P(X and y). The
probability we are seeking can be expanded as

\[ \sum_b P(X \text{ and } y \text{ and } I=a \text{ and } J=b)/g \]
\[ = \sum_b P(X \text{ and } y \mid I=a \text{ and } J=b) \, P(I=a \text{ and } J=b)/g \]

by the definition of conditional probability. Now we assume that once a
and b are given, X and y are distributed independently, so the expression
becomes

\[ \sum_b P(X \mid I=a) \, P(y \mid J=b) \, P(I=a \text{ and } J=b)/g \]

where $P(X \mid I=a)$ is the density function of X given I=a.

The expression becomes

\[ \frac{P(X \mid I=a)}{g}\Big[ P(y \mid J=a) \, P(I=a \text{ and } J=a) + \sum_{b \ne a} P(y \mid J=b) \, P(I=a \text{ and } J=b) \Big] \]

By the $\theta$ model definitions

\[ P(I=a \text{ and } J=b) = P(I=a \mid J=b) \, P(J=b) \]
\[ = P(J=a) \, [(1-\theta) P(I=a) + \theta] \quad \text{when } b = a \]
\[ = P(J=b) \, [(1-\theta) P(I=a)] \quad \text{when } b \ne a \]

We are assuming P(I=a) = P(J=a). So the expression is

\[ \frac{P(X \mid I=a)}{g}\Big[ P(y \mid J=a) \, P(I=a) \big( (1-\theta) P(J=a) + \theta \big) + \sum_{b \ne a} P(J=b) (1-\theta) P(I=a) \, P(y \mid J=b) \Big] \]
\[ = \frac{P(I=a) \, P(X \mid I=a)}{g}\Big[ \theta \, P(y \mid J=a) + (1-\theta) \sum_b P(J=b) \, P(y \mid J=b) \Big]. \]
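The two-pixel result can be tried on small numbers. The following sketch is an illustration under assumed inputs (dictionaries of likelihood values keyed by hypothetical material names); it normalizes over materials so that g never has to be computed separately.

```python
def two_pixel_posterior(lik_x, lik_y, prior, theta):
    """Posterior P(I=a | X, y) for every material a.

    lik_x[a] = P(X | I=a), lik_y[b] = P(y | J=b), prior[a] = P(a).
    """
    # Mixture term: sum over b of P(J=b) * P(y | J=b).
    mix = sum(prior[b] * lik_y[b] for b in prior)
    unnormalized = {
        a: prior[a] * lik_x[a] * (theta * lik_y[a] + (1.0 - theta) * mix)
        for a in prior
    }
    g = sum(unnormalized.values())      # equals P(X and y)
    return {a: value / g for a, value in unnormalized.items()}

prior = {"a": 0.5, "b": 0.5}
lik_x = {"a": 0.2, "b": 0.2}            # center pixel data are uninformative
lik_y = {"a": 0.9, "b": 0.1}            # neighbor data favor material a

dependent = two_pixel_posterior(lik_x, lik_y, prior, theta=0.5)
independent = two_pixel_posterior(lik_x, lik_y, prior, theta=0.0)
```

With theta = 0 the neighbor has no influence and the posterior falls back to the one-pixel result; with theta = 0.5 the neighbor's evidence shifts the posterior toward material a.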


This formula expresses the a posteriori probability that I=a given the
data X and y from the neighboring pixels I and J. We will now extend this
formula to a center pixel $I_0$ and eight neighboring pixels $I_1, \ldots, I_8$
with data values $X_0$ and $X_1, \ldots, X_8$, respectively. The derivation is
analogous to the two-pixel case.

The decision rule will be to choose that material a for which the
a posteriori probability of a, given the nine data values $X_0, \ldots, X_8$, is
greatest. This probability is

\[ P(I_0=a \mid X_0, \ldots, X_8) = P(X_0, \ldots, X_8 \text{ and } I_0=a)/g \]

where g stands for $P(X_0, \ldots, X_8)$, by the definition of conditional
probability. This expression can be expanded to a sum of joint probabilities:

\[ \sum_{b_1} \cdots \sum_{b_8} P(X_0, \ldots, X_8 \text{ and } I_0=a, \ldots, I_8=b_8)/g \]
\[ = \sum_{b_1} \cdots \sum_{b_8} P(X_0, \ldots, X_8 \mid I_0=a, \ldots, I_8=b_8) \, P(I_0=a, \ldots, I_8=b_8)/g \tag{12} \]

We could, at this point, get bogged down with a complicated model for
$P(I_0=a, \ldots, I_8=b_8)$, but a practical course to take is to assume that the
neighbors are independent of each other and then

\[ P(I_1=b_1, \ldots, I_8=b_8 \mid I_0=a) = P(I_1=b_1 \mid I_0=a) \cdots P(I_8=b_8 \mid I_0=a) \tag{13} \]

The rightmost factor of expression (12) can be put in this form by applying
the definition of conditional probability:

\[ P(I_0=a, \ldots, I_8=b_8) = P(I_1=b_1, \ldots, I_8=b_8 \mid I_0=a) \, P(I_0=a) \]
\[ = P(I_1=b_1 \mid I_0=a) \cdots P(I_8=b_8 \mid I_0=a) \, P(I_0=a) \]

by the assumption (13) just made. We continue to assume that the data
$X_0, \ldots, X_8$ are distributed independently once the materials
$b_0, \ldots, b_8$ are known, i.e., that

\[ P(X_0, \ldots, X_8 \mid I_0=a, \ldots, I_8=b_8) = P(X_0 \mid I_0=a) \cdots P(X_8 \mid I_8=b_8) \]

The desired a posteriori probability is

\[ \sum_{b_1} \cdots \sum_{b_8} P(X_0 \mid I_0=a) \cdots P(X_8 \mid I_8=b_8) \, P(I_1=b_1 \mid I_0=a) \cdots P(I_8=b_8 \mid I_0=a) \, P(I_0=a)/g \]
\[ = \frac{P(X_0 \mid I_0=a) \, P(I_0=a)}{g} \Big[ \sum_{b_1} P(X_1 \mid I_1=b_1) \, P(I_1=b_1 \mid I_0=a) \Big] \cdots \Big[ \sum_{b_8} P(X_8 \mid I_8=b_8) \, P(I_8=b_8 \mid I_0=a) \Big] \]

which, written more compactly, becomes

\[ \frac{P(X_0 \mid I_0=a) \, P(I_0=a)}{g} \prod_{i=1}^{8} \Big[ \sum_b P(X_i \mid I_i=b) \, P(I_i=b \mid I_0=a) \Big] \]

Now

\[ P(I_i=b \mid I_0=a) = (1-\theta) P(I_i=b) \quad \text{for } b \ne a \]
\[ = (1-\theta) P(I_i=b) + \theta \quad \text{for } b = a \]

We are assuming $P(I_i=b)$ is the same for all i and can be written P(b). Thus the
a posteriori probability is

\[ \frac{P(X_0 \mid I_0=a) \, P(a)}{g} \prod_{i=1}^{8} \Big[ \theta \, P(X_i \mid I_i=a) + (1-\theta) \sum_b P(b) \, P(X_i \mid I_i=b) \Big] \]

The denominator g is the same for all materials. The decision is unaffected
by its absence. When all the prior probabilities P(a) are equal to 1/k,
then P(a) can be left out for the same reason. Also each square bracket term
can be divided by $\theta$. The resulting criterion is the one given in section 3.1:

\[ P(X_0 \mid I_0=a) \prod_{i=1}^{8} \Big[ P(X_i \mid I_i=a) + \frac{1-\theta}{\theta k} \sum_b P(X_i \mid I_i=b) \Big] \tag{1} \]
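A direct floating point evaluation of criterion (1) can be sketched as follows. This is an illustration only, with hypothetical names and inputs; it assumes equal priors 1/k, a strictly positive coefficient of dependence, and likelihood values supplied as dictionaries keyed by material.

```python
def bayes9_criterion(center_lik, neighbor_liks, theta, material):
    """Criterion (1) for one material.

    center_lik[a] = P(X_0 | a); neighbor_liks is a list of eight dictionaries
    with neighbor_liks[i][b] = P(X_i | b). Requires theta > 0.
    """
    k = len(center_lik)
    weight = (1.0 - theta) / (theta * k)
    product = center_lik[material]
    for lik in neighbor_liks:
        # Square bracket factor for this neighbor.
        product *= lik[material] + weight * sum(lik.values())
    return product

def bayes9_decide(center_lik, neighbor_liks, theta):
    """Choose the material with the largest criterion."""
    return max(center_lik,
               key=lambda a: bayes9_criterion(center_lik, neighbor_liks, theta, a))

center = {"wheat": 0.4, "corn": 0.5}            # center pixel slightly favors corn
neighbors = [{"wheat": 0.8, "corn": 0.1}] * 8   # neighbors strongly favor wheat

with_context = bayes9_decide(center, neighbors, theta=0.5)
almost_independent = bayes9_decide(center, neighbors, theta=1e-9)
```

As the coefficient of dependence approaches zero, every square bracket factor is dominated by the same sum, so the choice reverts to the one-point decision on the center pixel.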
APPENDIX B

IMPLEMENTATION OF BAYES9 

A multispectral software system at ERIM consists of subroutines PROCESS 
and POINT that mount, read and unpack data tapes, call modules to process the 
data, pack output data values, four to a word, and write an output tape. 

At the point-processing stage, each module accepts an input data point called
DATUM consisting of NCHAN channel values, modifies the DATUM vector in some
way, storing the output vector in DATUM. After all the prescribed modules 
have been called, POINT and PROCESS pack up DATUM, adding it on to the output 
line that will be written on tape. If several operations are to be performed, 
they can be done as separate jobs (with the intermediate tape for one job 
providing the input for the next) or they all can be run together (with each 
module picking up the output DATUM vector from the previous module). The modules
are also called at an earlier stage when initial calculations are made, and at a 
later stage for final calculations and printing of results. 

BAYES9 is a POINT module to carry out the nine-point Bayesian decision 
rule. Equal prior probabilities are assumed. This assumption is seldom 
departed from in multispectral processing, and the simpler formulation of BAYES9 
that it permits runs faster than the general rule, uses less storage and avoids 
scaling problems. A later version of the module will allow the user to select 
either of the two rules. 

The criterion used will be criterion (3) in section 3.1,

\[ P(X_0 \mid a) \prod_{i=1}^{8} \left[ \frac{P(X_i \mid a) + s\varepsilon + s \sum_b P(X_i \mid b)}{\varepsilon + s\varepsilon + s \sum_b P(X_i \mid b)} \right] \tag{3} \]

where $s = \dfrac{1-\theta}{\theta (k+1)}$ and the summation is over the non-null
densities. A preceding module, QRULE, calculates for each point X and each
material a the exponent of the likelihood $P(X \mid a)$ and stores it as an
integer C, with values from 0 to 511, in DATUM. The density

\[ P(X \mid a) = (2\pi)^{-n/2} \, e^{-\frac{1}{2} C} \]

(n being the number of channels) is not computed by QRULE or used in the
one-point decision rule because the exponent is sufficient for the decision.
BAYES9 does need the true density, but the constant factor does not affect the
decision so it is left out. $e^{-\frac{1}{2} C}$ is precalculated for all 512
possible values of C and stored in a floating point vector EXTAB. Thus the
exponentiation, which would ordinarily be time-consuming, is carried out in a
flash by referring to EXTAB(C).
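The table lookup can be illustrated in a few lines. This is a sketch in Python rather than the original MAD or assembly code; only the name EXTAB and the 0 to 511 range come from the text, and the helper name is hypothetical.

```python
import math

# Precompute exp(-C / 2) once for every possible integer exponent C in 0..511.
EXTAB = [math.exp(-0.5 * c) for c in range(512)]

def density_from_exponent(c):
    """Return the density, up to its constant factor, by a single table reference."""
    return EXTAB[c]
```

The constant factor of the density is omitted here, as in the text, because it is the same for every material.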

In the nine-point processing modules, a subroutine SAVE9 saves the two 
most recent lines of unpacked data vectors. Before the saving takes place, the 
square bracket factors of expression (3) are calculated and stored in the saved
line where DATUM would have ordinarily been stored. This allows each square 
bracket factor to be used eight times without being recomputed. 

The module is most easily explained if I first describe the earlier,
slower method of calculation and then show how it was modified. In this
earlier method, the square bracket factors are multiplied together and then
multiplied by $P(X_0 \mid a)$ to get the decision criterion in floating point form.
The material with the biggest criterion, corresponding to the largest a posteriori
probability, is chosen. The number of the material chosen is put out as
DATUM(1). The value of the criterion, appropriately scaled, is put out as
DATUM(2).

This number is, for its crucial values, very close to zero and so is hard
to scale. The transformation

\[ -2 \log_e (\text{criterion}) \]

makes it like a chi-square value, easily scaled. Let d stand for the criterion.
Because $\log_e(d)$ is a time-consuming library subroutine, the following
speeded up procedure is implemented.

\[ \log_e d = \log_2 d \cdot \log_e 2 \]

$\log_e 2 = .69$. Hence

\[ -2 \log_e d = -\log_2 d \cdot 1.3862944 \]

The multiplication by 1.39 is not essential. The purpose of the transformation
was to scale the criterion and this purpose is as well accomplished
if the 1.39 is left out. Therefore $-\log_2 d$ is obtained from the floating point
representation of d in the computer by the following rapid procedure.

The floating point representation of d in the IBM 7094 consists of a sign bit,
an 8-bit exponent E and a 27-bit fraction M. $E - 200_8$ is the power of 2 which
multiplies the fraction to get d:

\[ d = M \cdot 2^{(E - 200_8)} \]
\[ -\log_2 d = -\log_2 M - (E - 200_8) \]
\[ = -\log_2 M + (200_8 - E) \]

We want $-\log_2 d$ to be an integer, but $-\log_2 M$ is a fraction filling in the
gaps between the integer values of $(200_8 - E)$. Therefore we shift each term of
$-\log_2 d$ left nine places (i.e., multiply by 512). The second term could be
calculated by subtracting E from $200_8$ and shifting left 9 bits. But it is even
faster to precompute this result as a 512-valued table LOG2E and obtain it by a
reference to the table. The first term, shifted, is also obtained from a 512-valued
precomputed table LOG2M whose domain is the first nine bits of M. So to get
$-\log_2 d$ we just compute

    LOG2M(M) + LOG2E(E)

and shift right nine bits. Then 200 is added to make sure the criterion falls
in the range 0...511.
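The table-driven logarithm can be imitated on a modern machine. The sketch below is an analogy, not the IBM 7094 code: it uses `math.frexp` to split a positive number into a fraction in [0.5, 1) and a power of 2, looks up the fraction term in a table indexed by its first nine bits, and computes the exponent term by a shift instead of a second table. The names SHIFT, SCALE and `neg_log2_scaled` are assumptions.

```python
import math

SHIFT = 9                      # nine-bit scaling, i.e. a factor of 512
SCALE = 1 << SHIFT

# Table indexed by the first nine bits of the fraction M (0.5 <= M < 1).
# Entries below 256 are never used for a normalized fraction.
LOG2M = [0] * SCALE
for idx in range(SCALE // 2, SCALE):
    LOG2M[idx] = round(-math.log2(idx / SCALE) * SCALE)

def neg_log2_scaled(d):
    """Approximate -512 * log2(d) for d > 0 with one table lookup and one add."""
    fraction, exponent = math.frexp(d)      # d = fraction * 2**exponent
    idx = int(fraction * SCALE)             # first nine bits of the fraction
    return LOG2M[idx] + ((-exponent) << SHIFT)
```

Shifting the result right nine places gives the integer part of the negative base-2 logarithm; truncating the fraction to nine bits limits the error of the scaled value to a few units.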
This algorithm for computing the criterion requires eight floating point
multiplies in the inner loop, which is repeated for each material at each pixel.
A modified algorithm comes out with eight integer adds in the inner loop, a faster
operation, and is therefore used in the current version of BAYES9.

For this algorithm, each square bracket quantity is subjected to the $-\log_2$
transformation before storing the values in the SAVE9 array. The final right
shift is not performed at this time. When the product in criterion (3) is
actually calculated, it is the sum of eight integers. The leftmost factor
$P(X_0 \mid a)$ is included in the calculation by one more integer add. The
original form of $P(X_0 \mid a)$ is $-2 \log_e P(X_0 \mid a)$ expressed as an
integer between 0 and 511. A table LOGTAB(I) has precomputed I shifted left nine
places and divided by 1.3862944 to get it in the form $-\log_2 P(X_0 \mid a) \cdot 512$
like the square bracket factors.

The number of $\log_2$ operations performed is one per pixel, which is the
same as before. The line of center values is saved in a special vector because
only one line needs to be saved and it would be a waste of space to save two lines
of this data in SAVE9. The criterion is shifted right nine places before being
put out as DATUM(2), so that it will fall in the range 0...511.
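The replacement of eight multiplies by eight integer adds can be illustrated with made-up numbers. In this sketch the bracket factors and the center likelihood are hypothetical values, and the scaled logarithms are produced with `math.log2` rather than with lookup tables.

```python
import math

def scaled_neg_log2(value, scale=512):
    """Integer form of -log2(value), scaled by 512 (a nine-place left shift)."""
    return round(-math.log2(value) * scale)

factors = [0.9, 1.1, 0.75, 1.3, 0.6, 1.05, 0.95, 1.2]   # hypothetical bracket factors
center = 0.02                                           # hypothetical P(X_0 | a)

# Stored once per pixel, then reused: each factor becomes one integer.
stored = [scaled_neg_log2(f) for f in factors]

# Inner loop: eight integer adds plus one more for the center likelihood.
total = sum(stored) + scaled_neg_log2(center)
criterion = total >> 9                                  # final right shift

# Reference value computed the slow way, with floating point multiplies.
exact = -math.log2(center * math.prod(factors))
```

The integer total differs from the exact scaled value only by the accumulated rounding of the nine stored terms.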

The computation of the criterion requires many operations, none of them
lengthy by themselves. In this situation, much time is saved by programming
the point-processing routine in assembly language. Such a version was written
and found to give results identical to those of the MAD* version. A timing run
was made using six out of ten channels, 200 points per line and 60 lines with
the following results:

    QRULE alone                  203.9"
    QRULE + BAYES9 (MAD)         265.4"
    QRULE + BAYES9 (assembly)    225.4"

Subtracting the QRULE alone time from the other two, we find that the
additional time taken by the BAYES9 module was, in each case

    BAYES9 (MAD)                  61.5"
    BAYES9 (assembler)            21.5"

so the writing of the module in assembly language cut the running time for the
BAYES9 module by a factor of 3. The assembly-language version of QRULE + BAYES9
took only 11% longer than the one-point rule QRULE alone.

*a source language similar to ALGOL
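The arithmetic behind these figures can be reproduced directly from the three measured times. The variable names below are introduced only for this check.

```python
qrule = 203.9                 # QRULE alone
qrule_bayes9_mad = 265.4      # QRULE + BAYES9 (MAD)
qrule_bayes9_asm = 225.4      # QRULE + BAYES9 (assembly)

bayes9_mad = qrule_bayes9_mad - qrule      # additional time, MAD version
bayes9_asm = qrule_bayes9_asm - qrule      # additional time, assembly version

speedup = bayes9_mad / bayes9_asm          # close to 3
overhead = bayes9_asm / qrule              # fraction added to QRULE alone
```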

The left side of the null test has been scaled by the $-\log_2$ transformation
and so the right side becomes $-\log_2 \varepsilon_2 + 200$. To estimate
$-\log_2 \varepsilon_2$, choose a value EXPLIM of chi-square representing a
reasonable dividing line between points inside and outside the distribution,
estimate a typical value $|R|$ of the determinant of a covariance matrix and
compute

\[ -\log_2 \varepsilon_2 = \frac{\text{EXPLIM} + \log_e |R|}{1.38} \]

A first guess for EXPLIM is the value in the table of the chi-square distribution
with the row corresponding to the number of original data channels and the
column corresponding to the .001 significance level.
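The threshold estimate can be sketched as a one-line function. The divisor is written as 2 log_e 2 (about 1.386) rather than the rounded 1.38, and the sample values of EXPLIM and of the determinant are hypothetical, chosen only to exercise the formula.

```python
import math

def neg_log2_eps2(explim, det_r):
    """Estimate -log2(eps_2) from a chi-square limit and a typical determinant |R|."""
    return (explim + math.log(det_r)) / (2.0 * math.log(2.0))

threshold = neg_log2_eps2(22.46, 50.0)      # hypothetical EXPLIM and |R|

# Same quantity computed from the density level exp(-(EXPLIM + ln|R|) / 2).
reference = -math.log2(math.exp(-0.5 * (22.46 + math.log(50.0))))
```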

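We will now derive bounds for criterion (3). Each square bracket term
can be written

\[ \frac{P(X_i \mid a) + s\varepsilon + s \sum_b P(X_i \mid b)}{\varepsilon + s\varepsilon + s \sum_b P(X_i \mid b)} \tag{14} \]

where the sum is over the non-null materials. We abbreviate expression (14)
as

\[ \frac{P + S\varepsilon + S\Sigma}{\varepsilon + S\varepsilon + S\Sigma} \tag{15} \]

for ease of discussion. All the terms are > 0 and $P \le \Sigma$. (15) can get
small only when $P < \varepsilon$. The lower bound occurs when $P \to 0$. Then
(15) becomes

\[ \frac{S\varepsilon + S\Sigma}{\varepsilon + S\varepsilon + S\Sigma} \tag{16} \]

The bigger $\Sigma$ is, the closer (16) is to 1. Hence the lower bound occurs when
$\Sigma \to 0$ and is therefore

\[ \frac{S\varepsilon}{\varepsilon + S\varepsilon} = \frac{S}{1+S} \]

Expression (15) increases from 1 as P increases from $\varepsilon$. Hence

\[ 1 < \frac{P + S\varepsilon + S\Sigma}{\varepsilon + S\varepsilon + S\Sigma} < \frac{P + S\varepsilon + S\Sigma}{S\varepsilon + S\Sigma} < \frac{P + S\Sigma}{S\Sigma} \]

because the effect of adding $S\varepsilon$, top and bottom, is to bring the
fraction closer to 1.

\[ \frac{P + S\Sigma}{S\Sigma} \le \frac{\Sigma + S\Sigma}{S\Sigma} = \frac{1+S}{S} \]

because $P \le \Sigma$. Thus we have shown

\[ \frac{S}{1+S} < \text{square bracket factor of (3)} < \frac{1+S}{S} \]

Now

\[ S = \frac{1-\theta}{\theta (k+1)} \]
\[ 1 + S = \frac{\theta k + \theta + 1 - \theta}{\theta (k+1)} = \frac{\theta k + 1}{\theta (k+1)} \]
\[ \frac{S}{1+S} = \frac{1-\theta}{\theta k + 1} \]

So

\[ \frac{1-\theta}{\theta k + 1} < \text{square bracket factor of (3)} < \frac{\theta k + 1}{1-\theta} \]

Hence

\[ P(X_0 \mid a) \left( \frac{1-\theta}{\theta k + 1} \right)^{8} < \text{expression (3)} < P(X_0 \mid a) \left( \frac{\theta k + 1}{1-\theta} \right)^{8} \]

The bounds show that no square bracket factor in criterion (3) is an
unreasonably large or small number. LOG2E therefore does not have to be
dimensioned higher than $300_8 = 192_{10}$.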
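The bounds can be spot-checked by evaluating the square bracket factor on randomly drawn inputs. This is a numerical illustration under assumed ranges, not a proof; the function names, the seed and the sampling ranges are all choices made for the example.

```python
import random

def bracket_factor(p, sigma, eps, theta, k):
    """Square bracket factor of criterion (3): p is the density of material a,
    sigma the sum of the densities of all k materials, eps the null density."""
    s = (1.0 - theta) / (theta * (k + 1))
    return (p + s * eps + s * sigma) / (eps + s * eps + s * sigma)

def bracket_bounds(theta, k):
    """Lower and upper bounds (1 - theta)/(theta*k + 1) and its reciprocal."""
    lower = (1.0 - theta) / (theta * k + 1.0)
    return lower, 1.0 / lower

random.seed(0)
theta, k = 0.5, 4
lower, upper = bracket_bounds(theta, k)

samples = []
for _ in range(1000):
    sigma = random.uniform(0.0, 5.0)   # sum of the k material densities
    p = random.uniform(0.0, sigma)     # one of those densities, so p <= sigma
    eps = random.uniform(1e-6, 1.0)    # null density
    samples.append(bracket_factor(p, sigma, eps, theta, k))
```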

The development of a criterion analogous to (3) when the prior probabilities
are unequal is as follows. We suppose that each defined material b has a
positive prior probability P(b), that the null class N has a positive
prior probability P(N) and that

\[ \sum_{\text{all } b \text{ including } N} P(b) = 1. \]

The BAYES9 criterion in this case (given in Appendix A) is

\[ \frac{P(X_0 \mid a) \, P(a)}{g} \prod_{i=1}^{8} \Big[ \theta \, P(X_i \mid a) + (1-\theta) \sum_b P(b) \, P(X_i \mid b) \Big] \tag{17} \]

where the summation is over all materials and null. Criterion (17) for the
null class is

\[ \frac{\varepsilon \, P(N)}{g} \prod_{i=1}^{8} \Big[ \theta \varepsilon + (1-\theta) \sum_b P(b) \, P(X_i \mid b) \Big] \tag{18} \]

We choose null when

\[ \frac{\text{criterion (17)}}{\text{criterion (18)}} < 1 \]

i.e., when

\[ \frac{P(X_0 \mid a) \, P(a)}{\varepsilon \, P(N)} \prod_{i=1}^{8} \left[ \frac{\theta \, P(X_i \mid a) + (1-\theta) \sum_b P(b) \, P(X_i \mid b)}{\theta \varepsilon + (1-\theta) \sum_b P(b) \, P(X_i \mid b)} \right] < 1 \]

as in the derivation of criterion (3). Simplifying a little, we arrive at
the criterion

\[ P(X_0 \mid a) \, Q(a) \prod_{i=1}^{8} \left[ \frac{P(X_i \mid a) + t \sum_b P(b) \, P(X_i \mid b)}{\varepsilon + t \sum_b P(b) \, P(X_i \mid b)} \right] < \varepsilon \tag{19} \]

where $Q(a) = P(a)/P(N)$ and $t = (1-\theta)/\theta$. $-\log_2$ of criterion (19)
is tested against $-\log_2 \varepsilon_2$ when a map is made or when the
recognitions of each material are counted up.
APPENDIX C

IMPLEMENTATION OF PRIOR9 AND PREF9

In Sections 3.2 and 3.3, we derived two new nine point rules, PREF9
and PRIOR9. The classification criterion for PREF9 with respect to each
material a is the answer to the question "Given the data from the nine
pixels, what is the probability that if we choose a pixel at random from
the neighborhood it is material a?" This criterion written in mathematical
symbols is

\[ \tfrac{1}{9} P(a \mid X_0) + \cdots + \tfrac{1}{9} P(a \mid X_8) \]

where $X_0, \ldots, X_8$ are the data vectors from the center pixel and the eight
neighboring pixels, respectively. The factor 1/9 is the same for all
materials and can be omitted. By Bayes formula, the PREF9 criterion is

\[ \frac{P(a) \, P(X_0 \mid a)}{\sum_b P(b) \, P(X_0 \mid b)} + \cdots + \frac{P(a) \, P(X_8 \mid a)}{\sum_b P(b) \, P(X_8 \mid b)} \]

In summary, the PREF9 criterion for material a is the posterior
probability of material a averaged over the nine data values.

PRIOR9 is a rule that uses these estimates of the probabilities of
the different materials as the prior probabilities for a usual Bayesian
decision on the center pixel. This rule carries out the principle that
the prior probability that a pixel represents a given material is dependent
on the neighborhood in which the pixel is located. The criterion for PRIOR9
is criterion (1) multiplied by the center likelihood $P(X_0 \mid a)$:

\[ P(X_0 \mid a) \left[ \frac{P(a) \, P(X_0 \mid a)}{\sum_b P(b) \, P(X_0 \mid b)} + \cdots + \frac{P(a) \, P(X_8 \mid a)}{\sum_b P(b) \, P(X_8 \mid b)} \right] \]
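The PREF9 and PRIOR9 criteria described above can be sketched together, since the second is the first multiplied by the center likelihood. The function names, the material names and the likelihood values below are assumptions made for illustration; the null class is left out of this sketch.

```python
def pref9_criterion(liks, prior, material):
    """Sum over the nine pixels of the posterior probability of `material`.

    liks[0] holds the center pixel likelihoods P(X_0 | b); liks[1:] the neighbors.
    """
    total = 0.0
    for lik in liks:
        evidence = sum(prior[b] * lik[b] for b in prior)
        total += prior[material] * lik[material] / evidence
    return total

def prior9_criterion(liks, prior, material):
    """PREF9 criterion multiplied by the center likelihood."""
    return liks[0][material] * pref9_criterion(liks, prior, material)

prior = {"wheat": 0.5, "corn": 0.5}
liks = [{"wheat": 0.4, "corn": 0.5}] + [{"wheat": 0.8, "corn": 0.2}] * 8

pref9 = {a: pref9_criterion(liks, prior, a) for a in prior}
prior9 = {a: prior9_criterion(liks, prior, a) for a in prior}
prior9_choice = max(prior9, key=prior9.get)
```

Because each pixel's posterior probabilities sum to 1, the PREF9 values over all materials add up to 9 when no null class is present.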




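It was shown in section 3.2 that a reasonable null test can be derived
by assuming that a null category N has a flat distribution of height
$\varepsilon$. N is chosen when its criterion (5) is larger than that of the
winning material. The test turns out to be equivalent to choosing N if

\[ \frac{P(X_0 \mid a) \left[ \dfrac{P(X_0 \mid a)}{\varepsilon + \sum_b P(X_0 \mid b)} + \cdots + \dfrac{P(X_8 \mid a)}{\varepsilon + \sum_b P(X_8 \mid b)} \right]}{\dfrac{1}{\varepsilon + \sum_b P(X_0 \mid b)} + \cdots + \dfrac{1}{\varepsilon + \sum_b P(X_8 \mid b)}} < \varepsilon_2^2 \tag{9} \]

The $\varepsilon$ on the right side of (9) is called $\varepsilon_2$ to remind us
that it can be changed after PRIOR9 is run whereas the $\varepsilon$ on the left
side must be specified before that run.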

Two strategies for calculating criterion (9) present themselves.
The first is to store

\[ p_{ia} = \frac{P(X_i \mid a)}{\varepsilon + \sum_b P(X_i \mid b)} \]

as an integer for each pixel and each material a and

\[ p_{i\varepsilon} = \frac{\varepsilon}{\varepsilon + \sum_b P(X_i \mid b)} \]

as an integer in channel k+1. Criterion (9) is

\[ \frac{\dfrac{p_{0a}}{p_{0\varepsilon}} \, \varepsilon \sum_{i=0}^{8} p_{ia}}{\dfrac{1}{\varepsilon} \sum_{i=0}^{8} p_{i\varepsilon}} < \varepsilon_2^2 \]

i.e.

\[ \frac{p_{0a} \sum_{i=0}^{8} p_{ia}}{p_{0\varepsilon} \sum_{i=0}^{8} p_{i\varepsilon}} < \frac{\varepsilon_2^2}{\varepsilon^2} \]

Only the numerator would be calculated to determine the winning material
(because the denominator is the same for all materials and therefore doesn't
affect that decision) and then the denominator divided to perform the null
test.

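The second strategy uses the relationship

\[ \sum_b p_{ib} + p_{i\varepsilon} = 1 \]

(i.e., the posterior probabilities sum to 1.) Hence

\[ \sum_{i=0}^{8} p_{i\varepsilon} = 9 - \sum_b \sum_{i=0}^{8} p_{ib} \]

Hence

\[ 9 - \sum_b \sum_{i=0}^{8} p_{ib} = \frac{\varepsilon}{\varepsilon + \sum_b P(X_0 \mid b)} + \cdots + \frac{\varepsilon}{\varepsilon + \sum_b P(X_8 \mid b)} \]

Hence criterion (9) is

\[ \frac{P(X_0 \mid a) \sum_{i=0}^{8} p_{ia}}{9 - \sum_b \sum_{i=0}^{8} p_{ib}} < \frac{\varepsilon_2^2}{\varepsilon} \]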
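The identity behind the second strategy can be verified on sample numbers: summing the null terms directly gives the same result as subtracting the material terms from 9. The helper name, the null density and the likelihood values below are hypothetical.

```python
def normalized_terms(lik, eps):
    """Return the material terms p_ib and the null term for one pixel,
    each a density divided by (eps + sum of the material densities)."""
    denom = eps + sum(lik.values())
    p = {b: value / denom for b, value in lik.items()}
    return p, eps / denom

eps = 0.05                                              # hypothetical null density
pixels = [{"wheat": 0.4, "corn": 0.5}] + [{"wheat": 0.8, "corn": 0.2}] * 8

direct = 0.0            # first strategy: sum the nine null terms
material_total = 0.0    # second strategy: running total of all material terms
for lik in pixels:
    p, p_null = normalized_terms(lik, eps)
    direct += p_null
    material_total += sum(p.values())

indirect = 9 - material_total
```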
Saving channel k+1 for $p_{i\varepsilon}$ is no longer necessary, but now a line of
center pixel vectors $P(X_0 \mid a)$ must also be saved. The tradeoff in storage is
between saving two lines with one extra channel vs saving one extra line of
k channels. Timewise, the balance is between summing 9 integers $p_{i\varepsilon}$
and summing k integers $\sum_{i=0}^{8} p_{ib}$ and subtracting from 9. There is
also retrieval bookkeeping in getting at the values in the first strategy.

A modification of the second strategy is most economical of storage and
not costly of time.

\[ p_{ia} = \frac{P(X_i \mid a)}{\varepsilon + \sum_b P(X_i \mid b)} \]

is saved as an integer in k channels of every pixel in two lines.

\[ f = \varepsilon + \sum_b P(X_0 \mid b) \]

is saved in floating point for each pixel of a single line. The chosen
material is the one maximizing

\[ p_{0a} \sum_{i=0}^{8} p_{ia} \tag{20} \]

as before. Because this criterion is mostly a sum of integers (a very rapid
operation on the computer) and because any decision criterion is calculated
for every material a at every pixel, this criterion takes relatively little
time.

The null test criterion (9) is calculated in the form

\[ \frac{f \left[ p_{0a} \sum_{i=0}^{8} p_{ia} \right]}{9 - \sum_b \sum_{i=0}^{8} p_{ib}} \tag{21} \]

for the winning material only. The square bracket factor in the numerator
is the winning criterion (20). The sum $\sum_{i=0}^{8} p_{ib}$ is calculated for
the criterion (20) of every material, so it is simple to keep the running total

\[ \sum_b \sum_{i=0}^{8} p_{ib} \]

The square bracket term and the denominator are each converted to floating
point and then the floating point expression (21) is calculated.

In order to express $p_{ia}$ as an integer, the denominator of $p_{ia}$

\[ \varepsilon + \sum_b P(X_i \mid b) \]

is divided by 100000 and then stored as f. This has the effect of
multiplying $p_{ia}$ by 100000. The denominator of (21) becomes

\[ 900000 - \sum_b \sum_{i=0}^{8} p_{ib} \]

The scaling cancels out for every factor in (21) but $p_{0a}$. When $p_{0a}$ is
multiplied by the reverse-scaled f the original scale is restored.

In order that the null test can be performed at map-making time, the
null test criterion is scaled by the transformation

\[ -\log_2 [\text{criterion (21)}] + 100 \]

and stored as an integer in an output channel. The transformation is quickly
carried out by exactly the same method as was described in Appendix B.

For PREF9, the criterion for deciding among materials is the averaged
posterior probability of material a

\[ \sum_{i=0}^{8} p_{ia} \]

(the factor 1/9 has been omitted because it is the same for all materials.)
The null test is

\[ \frac{\sum_{i=0}^{8} p_{ia}}{9 - \sum_b \sum_{i=0}^{8} p_{ib}} \tag{22} \]

and is scaled by the transformation

\[ -8 \log_2 [\text{criterion (22)}] + 200 \]

Because PRI0R9 requires the calculation of expressions used in PREF9, 
both rules are carried out in the same module containing a control variable 
that can be set to activate either or both of the two rules. If either 
rule is used alone, the number of the winning material is put out as channel 1 
and the transformed null test criterion as channel 2. If both rules are 
requested, PRI0R9 puts out channels 1 and 2, PREF9, channels 3 and 4 and 
QRULE, the usual one-point decision rule, channels 5 and 6. 

The input to the processing module PRIOR9 is the output of the one-point
decision rule module QRULE, which can be run to put out $-2 \log_e$ (density of
material b) for each material b in channels 3 through k+2, the smallest of
these numbers (corresponding to the largest density) in channel 2, and the
number of the material with the largest density (i.e., the one-point rule
choice) in channel 1. For each input number $y_b$ corresponding to a material
b, it is necessary to calculate $P(X|b) = e^{-y_b/2}$. Because $y_b$ is an integer
between 0 and 511, this calculation is very rapidly accomplished by referring
to the $y_b$-th element of a precalculated vector of length 512. These remarks
about input also apply to the module BAYES9 (see Appendix B).
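
The table lookup just described can be sketched in Python as follows. The names `EXP_TABLE` and `density_from_channel` are hypothetical, chosen only for this example.

```python
import math

# Precalculated vector of length 512: EXP_TABLE[y] = e^(-y/2).
EXP_TABLE = [math.exp(-y / 2.0) for y in range(512)]

def density_from_channel(y):
    """Recover P(X|b) from the integer y = -2 log_e(density) put out by QRULE."""
    if not 0 <= y <= 511:
        raise ValueError("y must be an integer between 0 and 511")
    return EXP_TABLE[y]
```

The exponential is evaluated only 512 times, when the table is built, and each pixel afterwards costs one indexing operation per material.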

We thought that the user would select both PREF9 and PRIOR9 only when
the principal goal was experimental comparison of the two rules, and in that
case, it would be convenient to have the QRULE results also on the same tape
to minimize the number of tape mounts in map-making and to allow for the
computation of agreements and disagreements among the three rules; hence the
passing along of the QRULE results into channels 5 and 6.
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The division of roles between e and criterion (9) could have 


been made another way, namely 

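    $$P(X_0|a) \left[ \frac{P(X_0|a)}{\epsilon + \sum_b P(X_0|b)} + \cdots + \frac{P(X_8|a)}{\epsilon + \sum_b P(X_8|b)} \right] < \epsilon \left[ \frac{\epsilon}{\epsilon + \sum_b P(X_0|b)} + \cdots + \frac{\epsilon}{\epsilon + \sum_b P(X_8|b)} \right] \qquad (23)$$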


We will now show that this different division of roles results in the same 
numerical procedures and hence effectively the same null test as previously 
derived. 

Following the first implementation strategy, test (23) can be written

    $$\frac{\sum_{i=0}^{8} p_{0a}\, p_{ia}}{\sum_{i=0}^{8} p_{0\epsilon}\, p_{i\epsilon}} < 1,$$

i.e.,

    $$\frac{p_{0a} \sum_{i=0}^{8} p_{ia}}{p_{0\epsilon} \sum_{i=0}^{8} p_{i\epsilon}} < 1,$$

which is the same as before except that the right side was $\epsilon_1/\epsilon$. If $\epsilon_1$ is
set theoretically and not empirically, then $\epsilon_1 = \epsilon$ and both tests are identical.
If $\epsilon_1$ is set empirically then the right side is set empirically and it makes no
difference what it is. In other words, the right side is set to make a good 
looking map or to leave unclassified a certain respectable percentage of 
points, so all that matters is what null test criterion is stored in output 
channel 2, and that is the same for both formulations of the null test. 
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Following the second implementation stragety, test (23) is 


    $$\frac{P(X_0|a) \sum_{i=0}^{8} p_{ia}}{9 - \sum_b \sum_{i=0}^{8} p_{ib}} < \epsilon,$$

which is the same as before except that the right side was $\epsilon_1$, and the
remarks made about the comparison of first strategy tests apply.

We tried PRIOR9 and PREF9 run jointly with QRULE on 6-channel data
(as a subset of 10 original channels) from 60 lines with 200 points per line.
The run took only 12% longer than QRULE run alone. (This compares with 11%
longer for BAYES9 and 16% longer for the outmoded BAYES8.) No significant
saving of time would result from running PRIOR9 or PREF9 alone because all
the most time-consuming calculations would still have to be made.
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APPENDIX D 

A PLAN TO TEST AND EVALUATE NINE-POINT RULES ON SATELLITE DATA 

The purpose of the test is to compare the performance of nine-point 
classification rules on satellite data. The test will be carried out on
data from the LACIE intensive study sites. It will determine which rules 
best recognize wheat, whether their recognition is better than that of 
the usual one-point rule QRULE, and if so, how much better. 

To economize on computer running time, the test will be carried out
in three stages. The first stage will be an exploratory comparison of 
all the rules on two LACIE intensive study sites. This exploratory stage 
is necessary because the performance of nine-point rules on LANDSAT data 
has not previously been tested. After considering the results from the 
first stage and from previous testing on aircraft data, we will choose no 
more than three nine-point rules for extensive testing in the second stage. 

The third stage will be to test the second-stage rules as acreage estimators. 

In the first stage, an attempt will be made to find two LACIE 3x3 mile 
intensive study sites on which the usual classification rule QRULE does not 
give perfect results, so as to allow room for improvement. Pixels that 
are clearly interior points of fields will be identified; these are the pixels
for which we are confident of the ground truth. Half of these field interiors
will be designated as training sets and the rest as test sets. The training
sets will be clustered and combined into a small number of signatures for
each material. This set of signatures will be called SIGS 1. The same
data set will be used twice by designating the former test sets as training
sets and the training sets as test sets, clustering the new training sets
and combining them into a set of signatures SIGS 2. To preserve
resolution, the data will not be rotated. 

The data from the 73 x 105 pixel rectangle enclosing the intensive 
study site will be processed by QRULE, using first SIGS 1 then SIGS 2. The
result is an output tape with two files, corresponding to SIGS 1 and SIGS 2,
respectively, with the QRULE choice in channel 1, the QRULE null test
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criterion in channel 2, and in the succeeding channels, a normal density 
for each material. These densities are input for all the nine-point rules. 

The processing time for such a small rectangle is negligible compared to
the time for identifying the pixels interior to the fields. 

The QRULE results will be tallied by running the processing module
TALLY on the QRULE output tape. If the results are so good that they allow
little room for improvement, then it would be a waste of time to test whether 
other decision rules improve performance. Therefore either the signatures 
would be degraded by combining them into one signature for each material 
or another intensive study site or the same one at a different time of 
year would be chosen. 

If the QRULE results allow room for improvement, then the enclosing
rectangle will be processed by the modules LIKE9 with m=9, AVE9 with t=1,
VOTE9, PREF9, PRIOR9, and BAYES9 with $\theta$ = .1, .3, .5, and .7. By storing
the programs in separate core loads, this processing can all be done in 
one job with the QRULE output tape as input and an output tape with 18 
files of pixel identifications, two files (for SIGS 1 and SIGS 2) per rule. 

The module TALLY will be run on this processing output tape to count the
number of identifications of each material in each field interior and punch
these numbers on cards. The cards will be input to a program DISPLAY to
average the results and display them in tabular form. The results will
be given separately for training and test fields, and in another breakdown,
for large fields and small. The measure of performance displayed will be
the percent misclassified. The total processing time will not be lengthy
because the enclosing rectangle (73 x 105 pixels) is so small. 
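
The measure of performance can be illustrated with a short Python sketch. The data layout (a dictionary of identification counts per field, paired with the field's true material) and the function names are assumptions for the example, not the formats used by TALLY or DISPLAY.

```python
def percent_misclassified(counts, true_material):
    """Percent of a field's pixels identified as anything but the true material."""
    total = sum(counts.values())
    wrong = total - counts.get(true_material, 0)
    return 100.0 * wrong / total

def group_error_rate(fields):
    """Arithmetic average of the per-field error rates.

    fields is a list of (counts, true_material) pairs, one per field interior.
    """
    rates = [percent_misclassified(counts, truth) for counts, truth in fields]
    return sum(rates) / len(rates)
```

Averaging per-field rates, instead of pooling all pixels, gives a small field the same weight as a large one; this is why results are also broken down by field size.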

The procedure just described is applied to two intensive study sites.
Then we will look at the results from the two sites and, with less attention,
at previous aircraft results to decide which rules are promising enough to
merit more lengthy testing. Not more than three will be chosen. 

The procedure for the second stage will be the same as for the first 
except that there will be more sites tested and fewer rules. We would attempt 
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to run tests on enough sites to determine whether the best nine-point rules 
significantly outperform QRULE in the wheat study and estimate how much 
improvement is to be expected. If, as seems likely, BAYES9 is one of the 
finalists, we will try to estimate an optimal value or range of values for 
the parameter $\theta$.

In the third stage we compare the performance of the nine-point rule 
finalists and QRULE as acreage estimators. The pixels of the site will 
be assigned to four rectangles as shown: 



The rectangles will be located on photographic enlargements and the 
wheat area measured by planimeter for each rectangle. This area will be 
divided by the area of the rectangle and then multiplied by the number of 
pixels in the rectangle to obtain the true wheat acreage (except that number
of pixels is used as a measure of area rather than acres). QRULE and the
nine-point rule finalists will be compared as acreage estimators in two
ways. The first way will be for the rules to classify each pixel and count
the wheat pixels in the rectangle. The second way will be to estimate the
posterior probability of wheat for each pixel and sum these probabilities
over the rectangle.

The results will be compared on the same site data used in the first 
two stages so that differing relative performances of the rules as classifiers 
and acreage estimators can be noted. Some reprogramming of the decision 
rule modules to function as acreage estimators will be needed. 
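
The two acreage estimators and the planimeter reference value can be sketched as follows. This is a minimal illustration in Python; the function names and the list-based inputs are assumptions for the example.

```python
def wheat_by_count(labels, wheat="wheat"):
    """First way: classify each pixel, then count the wheat pixels."""
    return sum(1 for label in labels if label == wheat)

def wheat_by_probability(posteriors):
    """Second way: sum the posterior probability of wheat over the rectangle."""
    return sum(posteriors)

def true_wheat_pixels(wheat_area, rectangle_area, n_pixels):
    """Planimeter reference: wheat fraction of the rectangle times its pixel count."""
    return wheat_area / rectangle_area * n_pixels
```

The second estimator lets an uncertain pixel contribute a fraction of a pixel, so the two estimates can differ even when both are based on the same posterior probabilities.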
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APPENDIX E 

DESCRIPTION OF HYPOTHESIS TESTING TECHNIQUES 
A. Field Building Algorithm

The scene to be processed is first decomposed into a set of samples, 
each of which is a square array of points. Beginning with the first row 
of samples, the statistical similarity of adjacent samples is tested. Two 
samples are combined to form a pixel group when they are determined to be 
statistically similar. This group is then treated as a new, larger sample 
and is tested against the next sequentially occurring sample. A sample
which is not similar to the pixel group to which the previous sample belongs 
is assigned to a new pixel group. The entire first row of samples is 
thus combined into a series of pixel groups, the samples in each group 
having been determined to be statistically similar. A group may contain 
only one sample or it may contain many samples.
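
The first-row pass can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, and the stand-in similarity check (a difference of means against an arbitrary tolerance) is an assumption that takes the place of the statistical tests described later in this appendix.

```python
def mean(values):
    return sum(values) / len(values)

def means_close(sample, group_pixels, tolerance=2.0):
    """Stand-in similarity check, assumed for this sketch only."""
    return abs(mean(sample) - mean(group_pixels)) < tolerance

def group_first_row(samples, similar=means_close):
    """Combine a row of samples into pixel groups, scanning left to right.

    samples is a list of samples, each a list of one-channel pixel values.
    Returns a list of groups, each a list of sample indices.
    """
    groups = []   # sample indices belonging to each pixel group
    pooled = []   # pixel values of the group currently being built
    for index, sample in enumerate(samples):
        if groups and similar(sample, pooled):
            # The enlarged group is treated as a new, larger sample.
            groups[-1].append(index)
            pooled.extend(sample)
        else:
            # A dissimilar sample starts a new pixel group.
            groups.append([index])
            pooled = list(sample)
    return groups
```

Because each sample is compared with the whole accumulated group and not just with its neighbor, a slow drift across a field is less likely to be chained into a single group.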

Succeeding rows of samples, beginning with the second row, are processed 
by testing each sample against the pixel group which has, as one of its 
members, the sample in the preceding row directly corresponding to the 
test sample. Samples in two consecutive rows are directly corresponding 
when they are both displaced by the same number of samples from the beginning 
of each row. If the sample is not similar to this group and is not the
first sample in the row, it is tested against the pixel group to which the
preceding sample in the row belongs, if that sample has been assigned to 
a pixel group. If the sample passes the statistical similarity test, it 
is assigned to that group. If the test is not satisfied or if the preceding 
sample has not been assigned to a pixel group, the current sample is not 
assigned to any group and processing continues to the next sample. If the 
sample is similar to the pixel group to which the corresponding sample 
in the previous row belongs, it is assigned to that pixel group. Also, 
if the sample preceding the current sample has not been assigned to a 
pixel group, this preceding sample is tested against the group to which the 
current sample was just assigned. If they are not similar, the preceding
sample remains unassigned, and processing continues to the next sample
following the current sample. If the preceding sample is similar, however,
it is added to the pixel group to which the current sample was just assigned.
This testing of preceding samples continues back along the row of samples
until either a sample is encountered which is not similar to the group to
which the current sample was assigned, or a sample is encountered which
has already been assigned to a pixel group, or there are no more remaining
samples to be tested. 

Each row of samples is processed in this manner until the last sample
in the row has been processed. At this point, the current row of samples
is scanned back along the row until a sample is encountered which has not
been assigned to a pixel group. The conditions under which unassigned
samples may exist were explained previously. When an unassigned sample
is encountered, it is assigned as the first element of a new pixel group. 

If the next sample moving back along the row is also unassigned, it is 
tested against the just formed pixel group. If it is similar to this 
group it is assigned to it. If it is not similar, it is used to initiate
another new pixel group. Thus, whenever a consecutive series of unassigned 
samples is encountered, the first sample of the series is either assigned 
to the pixel group to which the preceding sample moving back along the row 
was assigned or it is used to initiate a new pixel group. After processing
a row of samples in both the forward and reverse directions, all of the 
samples in the row have been assigned to a pixel group. 

Once the entire scene has been processed in this manner all of the 
pixels have been assigned to contiguous sets of pixel groups. Ideally, 
each such set will correspond to one agricultural field. A closed field
boundary is formed around each of these groups by those pixels which are
part of the group but of which at least one of their eight immediate neighbors
belongs to another group. Essentially, the closed boundaries form the
outline in the scene of each of the pixel groups.

B. First-Order Similarity Test

Two statistical tests have been used by LARS to test the similarity 
between the multispectral data values of a sample and a pixel group. One 
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test uses first order statistics only, to test the hypothesis that the 
means of both are equal. Also included in this test is an empirical
check on the variances of both. For one channel of data, the test proceeds 
as follows. 


Let $(X_1, X_2, \ldots, X_N)$ be the one-channel data values of the points in the
sample. Also, let $(Y_1, Y_2, \ldots, Y_P)$ be the data values for the points in the pixel
group. Here, N is the number of pixels in the sample and P is the number
of pixels in the group. The mean of the sample, $M_X$, and the mean of the
group, $M_Y$, are given by:

    $$M_X = \frac{1}{N} \sum_{i=1}^{N} X_i, \qquad M_Y = \frac{1}{P} \sum_{i=1}^{P} Y_i$$

The sum of the sample values squared, $Q_X$, and the sum of the group values
squared, $Q_Y$, are given by:

    $$Q_X = \sum_{i=1}^{N} (X_i)^2, \qquad Q_Y = \sum_{i=1}^{P} (Y_i)^2$$

The normalized sum of squares of the sample values, $NSS_X$, and the normalized
sum of squares of the group values, $NSS_Y$, are given by:

    $$NSS_X = \sum_{i=1}^{N} (X_i - M_X)^2 = Q_X - N (M_X)^2$$

    $$NSS_Y = \sum_{i=1}^{P} (Y_i - M_Y)^2 = Q_Y - P (M_Y)^2$$

The pooled estimate of the variance of the sample values and the group
values, $V_P$, is given by

    $$V_P = \frac{NSS_X + NSS_Y}{(N-1) + (P-1)}$$


A two-tailed t test is performed by comparing a calculated t value,
$t_{CAL}$, against a critical value, $t_{CRIT}$:

    $$t_{CAL} = \frac{M_X - M_Y}{\sqrt{V_P \left( \frac{1}{N} + \frac{1}{P} \right)}}$$

For a sample to be determined as statistically similar to a pixel group,
the following relationship between the critical t value, $t_{CRIT}$, and $t_{CAL}$
must be satisfied in each channel:

    $$|t_{CAL}| < t_{CRIT}$$

The value, $t_{CRIT}$, is determined from a table of percentage points of the t
distribution and is based on the confidence level chosen and the
number of degrees of freedom, given by the quantity (N-1) + (P-1).

In addition, the variance of the sample values, given by $NSS_X/N$, and
the variance of the group values, $NSS_Y/P$, must satisfy the following:

    $$\sqrt{NSS_X/N} < .15\, M_X$$
    $$\sqrt{NSS_Y/P} < .15\, M_Y$$


This check on the variance requires that the standard deviation be less 
than 15% of the mean value in each channel. The constant value of .15 
was derived empirically by LARS. 
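
For one channel, the first-order test might be sketched in Python as follows. The function name is hypothetical, the critical value is assumed to be looked up by the caller, and the t statistic is the standard pooled two-sample form, which is taken here as an assumption about the garbled original expression. The handling of zero pooled variance is also an assumption.

```python
import math

def first_order_similar(sample, group, t_crit, max_ratio=0.15):
    """One-channel first-order similarity test between a sample and a pixel group."""
    n, p = len(sample), len(group)
    m_x = sum(sample) / n
    m_y = sum(group) / p
    # Normalized sums of squares; algebraically equal to Q - N * M**2.
    nss_x = sum((x - m_x) ** 2 for x in sample)
    nss_y = sum((y - m_y) ** 2 for y in group)
    v_p = (nss_x + nss_y) / ((n - 1) + (p - 1))   # pooled variance estimate
    if v_p == 0.0:
        # No spread at all: the means are equal or they are not (assumed rule).
        means_equal = m_x == m_y
    else:
        t_cal = (m_x - m_y) / math.sqrt(v_p * (1.0 / n + 1.0 / p))
        means_equal = abs(t_cal) < t_crit
    # Empirical check: standard deviation below 15% of the mean.
    spread_ok = (math.sqrt(nss_x / n) < max_ratio * m_x
                 and math.sqrt(nss_y / p) < max_ratio * m_y)
    return means_equal and spread_ok
```

In a multichannel scene the function would be called once per channel, and the sample would be accepted only if every channel passes.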


C. Second-Order Similarity Test

The second statistical test which has been employed by LARS to 
determine the similarity between a sample and a pixel group uses both first- 
order and second-order statistics. The test of the hypothesis that the 
means are equal is exactly the same as in the method just described. The 
difference results from the fact that this second method also tests the 
hypothesis that the variance of the sample is equal to the variance of 
the pixel group. 

A two-tailed F test is performed by comparing a calculated F value,
$F_{CAL}$, to a critical F value for both tails of the F distribution,
$F_{CRITLO}$ and $F_{CRITHI}$. $F_{CAL}$ is given by the expression:

    $$F_{CAL} = \frac{NSS_X / (N-1)}{NSS_Y / (P-1)}$$

The value of $F_{CRIT}$ is determined from a table of percentage points of the
F distribution and is based on the confidence level chosen, and the degrees
of freedom, (N-1) and (P-1). For a sample to be assigned to a pixel group
the following conditions must be satisfied in each channel:

    $$|t_{CAL}| < t_{CRIT}$$

    $$F_{CRITLO} < F_{CAL} < F_{CRITHI}$$
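
A one-channel sketch of the second-order test in Python follows. The function name is hypothetical, all three critical values are assumed to be supplied by the caller from tables, the t statistic is again assumed to be the standard pooled form, and rejecting a group with zero variance is an assumption made to avoid dividing by zero.

```python
import math

def second_order_similar(sample, group, t_crit, f_crit_lo, f_crit_hi):
    """One-channel test of equal means (t) and equal variances (F)."""
    n, p = len(sample), len(group)
    m_x = sum(sample) / n
    m_y = sum(group) / p
    nss_x = sum((x - m_x) ** 2 for x in sample)
    nss_y = sum((y - m_y) ** 2 for y in group)
    if nss_y == 0.0:
        return False   # F ratio undefined; treated as a failure (assumption)
    v_p = (nss_x + nss_y) / ((n - 1) + (p - 1))
    t_cal = (m_x - m_y) / math.sqrt(v_p * (1.0 / n + 1.0 / p))
    # Ratio of the two unbiased variance estimates.
    f_cal = (nss_x / (n - 1)) / (nss_y / (p - 1))
    return abs(t_cal) < t_crit and f_crit_lo < f_cal < f_crit_hi
```

The F ratio is checked against both tails because the sample variance may be either much larger or much smaller than the group variance.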
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APPENDIX F 

DERIVATION OF DISTRIBUTION OF BOUNDARY LOCATION ESTIMATE 

The received signal r(y) can be expressed in terms of the true
boundary location $y_0$ as

    $$r(y) = s(y - y_0) + n(y)$$

The expression can be substituted into the partial derivative of the
log-likelihood function to obtain the maximum likelihood estimate condition

    $$\int_{-Y}^{Y} s(y - y_0)\, s'(y - \hat{y})\, dy + \int_{-Y}^{Y} n(y)\, s'(y - \hat{y})\, dy - \int_{-Y}^{Y} s(y - \hat{y})\, s'(y - \hat{y})\, dy = 0$$

For a high signal-to-noise ratio, $\hat{y}$ will be approximately equal to $y_0$,
the true value. The first term in the above expression can then be expanded
in a Taylor series about the value $\hat{y} = y_0$. Making a change of variable

    $$\int_{-Y}^{Y} s(y - y_0)\, s'(y - \hat{y})\, dy = \int_{-Y-\hat{y}}^{Y-\hat{y}} s(y + \hat{y} - y_0)\, s'(y)\, dy$$

and using the first two terms of the Taylor expansion (additional terms are
assumed negligible), this quantity equals

    $$\int_{-Y-\hat{y}}^{Y-\hat{y}} s(y)\, s'(y)\, dy + (\hat{y} - y_0) \int_{-Y-\hat{y}}^{Y-\hat{y}} [s'(y)]^2\, dy$$

The original equation can now be written as

    $$\int_{-Y-\hat{y}}^{Y-\hat{y}} s(y)\, s'(y)\, dy + (\hat{y} - y_0) \int_{-Y-\hat{y}}^{Y-\hat{y}} [s'(y)]^2\, dy + \int_{-Y}^{Y} n(y)\, s'(y - \hat{y})\, dy - \int_{-Y}^{Y} s(y - \hat{y})\, s'(y - \hat{y})\, dy = 0$$

The first and last terms on the left-hand side cancel out and the quantity
$(\hat{y} - y_0)$ can be solved for:

    $$\hat{y} - y_0 = -\frac{\int_{-Y}^{Y} n(y)\, s'(y - \hat{y})\, dy}{\int_{-Y-\hat{y}}^{Y-\hat{y}} [s'(y)]^2\, dy}$$

The estimate $\hat{y}$ equals a constant plus a linear function of the noise n(y).
The noise is a Gaussian random process and, therefore, the estimate $\hat{y}$ is a
Gaussian random variable. The mean of $\hat{y}$ is given by

    $$E\{\hat{y}\} = y_0$$

The estimate is, therefore, unbiased.

The variance of $\hat{y}$ is given by

    $$V\{\hat{y}\} = E\{(\hat{y} - y_0)^2\} = \frac{E\left\{ \left[ \int_{-Y}^{Y} n(y)\, s'(y - \hat{y})\, dy \right]^2 \right\}}{\left[ \int_{-Y-\hat{y}}^{Y-\hat{y}} [s'(y)]^2\, dy \right]^2}$$

The numerator of this expression is equivalent to

    $$E\left\{ \int_{-Y}^{Y} \int_{-Y}^{Y} n(u)\, n(v)\, s'(u - \hat{y})\, s'(v - \hat{y})\, du\, dv \right\}$$

which equals

    $$\int_{-Y}^{Y} \int_{-Y}^{Y} R_n(u - v)\, s'(u - \hat{y})\, s'(v - \hat{y})\, du\, dv$$

Here, $R_n(u - v)$ is the correlation function of the noise. For white, Gaussian
noise

    $$R_n(u - v) = \frac{N_0}{2}\, \delta(u - v)$$

and the double integral is equal to

    $$\frac{N_0}{2} \int_{-Y-\hat{y}}^{Y-\hat{y}} [s'(y)]^2\, dy$$

Because $\hat{y}$ is approximately equal to $y_0$, the variance of $\hat{y}$ can be written as

    $$V(\hat{y}) = \frac{N_0}{2 \int_{-Y-y_0}^{Y-y_0} [s'(y)]^2\, dy}$$

For the forms of the signal s(y) which are of interest in this application,
$s'(y)$ falls to zero within a short distance of the point y = 0. Provided
$y_0$ is not too close to the endpoints of the interval, the denominator term in
the estimate variance will be a constant and the variance can be written as

    $$V(\hat{y}) = \frac{N_0}{2 \int_{-Y}^{Y} [s'(y)]^2\, dy}$$

The minimum variance of any estimate $\hat{y}$ of the boundary location is
given by the Cramér-Rao bound as

    $$V_{\min}(\hat{y}) = \frac{1}{E\left\{ \left[ \frac{\partial}{\partial \hat{y}} \ln p(r|\hat{y}) \right]^2 \right\}}$$

Substituting for the partial derivative,

    $$V_{\min}(\hat{y}) = \frac{1}{E\left\{ \left[ \frac{2}{N_0} \int_{-Y}^{Y} [r(y) - s(y - \hat{y})]\, s'(y - \hat{y})\, dy \right]^2 \right\}}$$

Using the relation that

    $$n(y) = r(y) - s(y - \hat{y})$$

and the correlation function of the white noise, the minimum variance is
given in the high signal-to-noise ratio case as

    $$V_{\min}(\hat{y}) = \frac{N_0}{2 \int_{-Y}^{Y} [s'(y)]^2\, dy}$$

This is precisely the variance which has been derived for the maximum
likelihood estimation procedure. Given the assumptions which have been made,
no other estimation procedure will result in a lower variance for the
estimated boundary location. The maximum likelihood procedure is optimal
in the sense of minimum variance.
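
As a numerical illustration of the variance expression $N_0 / (2 \int [s'(y)]^2 dy)$, the following Python sketch evaluates it by trapezoidal integration. The tanh-shaped edge, its amplitude and width, and the function names are all assumptions chosen for the example; the signal forms actually used are not specified in this appendix.

```python
import math

def boundary_variance(s_prime, n0, half_width, steps=4000):
    """Evaluate N0 / (2 * integral of s'(y)^2 over [-Y, Y]) by the trapezoidal rule."""
    h = 2.0 * half_width / steps
    total = 0.0
    for j in range(steps + 1):
        y = -half_width + j * h
        weight = 0.5 if j in (0, steps) else 1.0   # endpoints count half
        total += weight * s_prime(y) ** 2
    return n0 / (2.0 * total * h)

def tanh_edge_slope(y, amplitude=1.0, width=1.0):
    """Derivative of the hypothetical edge s(y) = amplitude * tanh(y / width)."""
    return amplitude / (width * math.cosh(y / width) ** 2)
```

For this assumed edge the integral of $[s'(y)]^2$ over the whole line is $4A^2/(3w)$, so the variance is $3 N_0 w / (8 A^2)$: a sharper or stronger edge is located more precisely.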
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GLOSSARY 


a           Letter denoting a material to be recognized by multispectral
            processing. It often refers to the material chosen by the
            decision rule.

AVE9        A classification rule that averages the nine data vectors in the
            3x3 pixel neighborhood and then applies the one-point rule QRULE
            to the result. To lessen sensitivity to alien points, the t
            largest and t smallest data values in each channel are deleted and
            the remaining values averaged. Sometimes called the "moving
            average" rule or the "trimmed mean" rule.

b           Letter denoting a material to be recognized by multispectral
            processing.

BAYES8      A classification rule similar to BAYES9 except that the smallest
            square bracket factor in an old form of the decision criterion

                $$P(X_0|a) \prod_{i=1}^{8} \left[ P(X_i|a) + s \sum_b P(X_i|b) \right]$$

            is omitted.

BAYES9      A classification rule based on the assumption that a pixel probably
            represents the same material as its neighbor, the degree of
            dependence specified by a parameter $\theta$ between 0 (independence) and
            1 (complete dependence). The decision criterion is

                $$P(X_0|a) \prod_{i=1}^{8} \frac{P(X_i|a) + s \sum_b P(X_i|b)}{\epsilon + s \sum_b P(X_i|b)}$$

criterion   An expression calculated by a classification rule for each material
            at every pixel. The material with the biggest criterion (sometimes
            the smallest) is the one chosen. A null test criterion is an
            expression which, when smaller (or sometimes larger) than a
            prescribed constant, signals a decision "none of these".
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DATUM 


            A channel vector that is passed from module to module of the ERIM
            multispectral processing system. It starts out being the vector of
            multispectral data values at each pixel. After application of the
            one-point processing module QRULE, DATUM(1) is the QRULE choice,
            DATUM(2) the QRULE null criterion and DATUM(3)...DATUM(k+2) are
            the exponents of the multivariate normal density conditional on the
            materials being recognized (see "exponent"). After applying a nine-
            point rule, DATUM(1) is the material chosen and DATUM(2) the
            null criterion.

error rate  The error rate for one field is the percent misclassified. The
            error rate of a group of fields is the arithmetic average of the
            error rates of the fields.

exponent    The multivariate normal density can be written

                $$\frac{e^{-\frac{1}{2} \left[ (X-u)' R^{-1} (X-u) + \log_e |R| \right]}}{(2\pi)^{k/2}}$$

            where u is the mean vector and R the covariance matrix. The
            expression in square brackets is sometimes referred to in this
            report as the "exponent".

k           The number of materials to be recognized for which distributions
            have been specified.

LACIE       Large Area Crop Inventory Experiment, an experiment to estimate from
            satellite data the acreage of wheat grown in various wheat-producing
            countries.

LACIE intensive study site
            A 3 x 3 mile area where LACIE data from the LANDSAT satellite
            is correlated with ground truth.

LANDSAT     The new name of the ERTS satellite that provides data for measuring
            earth resources.

level       The decision rules BAYES9, PRIOR9, and PREF9 decide null (i.e.,
            "none of these") when the winning material density is less than a
            prescribed number $\epsilon$. Level is the probability that a legitimate
            point from a material distribution will be so rejected. The
            relation between $\epsilon$ and level is

                $$\epsilon = e^{-\frac{1}{2} (EXPLIM + \log_e |R|)}$$

            where |R| is the determinant of the covariance matrix and EXPLIM is
            the value in the chi-square table corresponding to the row NCHAN
            (the number of channels of data) and the column "level".

LIKE9       The maximum likelihood classification rule derived from the assump-
            tion that the nine pixels in the 3x3 pixel neighborhood are an
            independent random sample from a normal multivariate distribution.
            It amounts to adding, for each material, the nine exponents (see
            "exponent") and then choosing the material with the smallest sum.
            To prevent occasional alien points from disturbing the decision rule,
            LIKE9 is modified to sum only the m smallest exponents, where
            m = 1, ..., 9.

LOG2E       A table used in the rapid calculation of $\log_2$ as shown in Appendix
            A. It presents the result of subtracting a 3-octal-digit number
            from $200_8$ and shifting left nine bits.

m           The LIKE9 criterion is the sum of the m smallest among the nine
            exponents. See LIKE9.

MAD         Michigan Algorithm Decoder is a computer source language resembling
            ALGOL.

nine-point rule
            A nine-point classification rule decides what material to assign
            to a pixel on the basis of data from that pixel and its eight
            immediate neighbors.

null test   A provision of a classification rule to decide "none of these",
            i.e., to decide that a pixel represents none of the materials for
            which distributions have been specified. The null test used in
            BAYES9, PRIOR9 and PREF9 is based on the assumption that materials
            in the scene not given a specific distribution may be lumped together
            in a null category that has a flat distribution of height $\epsilon$. When
            this distribution wins, a decision "none of these" is made.
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P 



            "probability that". P(X|a) means "probability of data vector
            X given that the pixel represents material a". P(a|X) is
            "probability that the pixel represents material a given that the
            data vector is X."

            probability that neighboring pixels represent the same material.

$p_{ia}$    The probability $P(a|X_i)$ that the i-th of the nine pixels is a, given
            that the prior probabilities are equal and the data vector is $X_i$.
            By Bayes' theorem it is

                $$\frac{P(X_i|a)}{\epsilon + \sum_b P(X_i|b)}$$

pixel       A resolution element of the multispectral scanning system. The
            continuous signal from a scan line is sampled to make it suitable
            for digital processing. The patch of ground corresponding to
            one sample is a pixel.

posterior probability
            Given the prior probability P(a) that a pixel represents material a
            and a known probability P(X|a) of getting X as the data vector given
            that the pixel represents a, the posterior probability P(a|X) that
            the pixel represents a given that the data vector is X is

                $$P(a|X) = \frac{P(a)\, P(X|a)}{\sum_b P(b)\, P(X|b)}$$

            by Bayes' formula. The posterior probability of a is thus the
            prior probability modified by information from the data vector X.

PREF9       A classification rule that uses the criterion "sum of the nine
            individual posterior probabilities", which is

                $$\sum_{i=0}^{8} p_{ia} = \sum_{i=0}^{8} \frac{P(X_i|a)}{\epsilon + \sum_b P(X_i|b)}$$

prior probability
            The prior probability P(b) that a pixel represents material b is the
            probability of b estimated without reference to the data from
            that pixel. In the absence of other information, the prior probabilities
            are often taken to be equal.
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PRI0R9 


            A classification rule embodying the principle that the prior
            likelihood that a pixel represents a given material is dependent
            on the neighborhood in which it is located. Specifically, the
            averaged posterior probability over the 3x3 pixel neighborhood
            becomes the prior probability for a Bayesian decision on the center
            pixel. The decision criterion is

                $$p_{0a} \sum_{i=0}^{8} p_{ia}$$

            and the null test is

                $$\frac{p_{0a} \sum_{i=0}^{8} p_{ia}}{9 - \sum_b \sum_{i=0}^{8} p_{ib}} \left[ \epsilon + \sum_b P(X_0|b) \right]$$

QRULE       The "quadratic rule", i.e., the usual one-point maximum likelihood
            classification rule. It puts out the one-point decision as
            DATUM(1), the one-point null criterion as DATUM(2), and in use
            with nine-point rules, is run to put out an exponent for each
            material in DATUM(3)...DATUM(k+2).

s           $$s = \frac{1 - \theta}{\theta (k+1)}$$

            where k is the number of materials and $\theta$ is the dependence
            parameter of BAYES9.

SAVE9       An array that saves two scan lines to provide the data for the 3x3
            array used in nine-point rules. See Appendix B.

Signature   The mean and covariance matrix that specify the multivariate normal
            distribution of a material.

SIGS1       A choice of 20 training fields from the 42 Imperial Valley field
            interiors used to test nine-point rules. Also refers to a choice of
            training fields in the proposed plan to test nine-point rules on
            satellite data.
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SIGS2 


            A choice of 20 training fields from the 22 Imperial Valley fields
            not in SIGS1. Also refers to a similar arrangement in the proposed
            plan to test nine-point rules on satellite data.

TALLY       A processing module that counts the number of recognitions of each
            material in a prescribed rectangle.

t           In the moving average rule AVE9, the t largest and t smallest data
            values in each channel are discarded or "trimmed" before the data
            values are averaged. This minimizes the disturbance of a noisy
            data vector.

X           The vector of data values at a pixel.

VOTE9       A classification rule, applied after one-point decisions have been
            made on the nine pixels, which assigns to the center pixel the
            material most frequently recognized in the nine pixels.

$\epsilon$  A quantity used by the null tests for BAYES9, PRIOR9, and PREF9. All
            materials in the scene that don't have a specific distribution are
            lumped into a null category that has a flat distribution of height
            $\epsilon$. The rules are applied as if this null distribution were a
            legitimate distribution; when it wins, a decision "none of these" is
            made.

$\epsilon_1$ The BAYES9 null test, for example, is to decide null if

                $$P(X_0|a) \prod_{i=1}^{8} \frac{P(X_i|a) + s \sum_b P(X_i|b)}{\epsilon + s \sum_b P(X_i|b)} < \epsilon_1$$

            Theoretically, $\epsilon_1$ is the same as $\epsilon$, but because it can be reset
            after BAYES9 processing and $\epsilon$ cannot, it is convenient to refer to
            it by a different symbol.

$\theta$    The dependence parameter of the classification rule BAYES9. $\theta$ = 0
            means neighboring pixels are completely independent. $\theta$ = 1 means
            neighboring pixels represent the same material. $\theta$ between 0 and
            1 means neighboring pixels probably represent the same material.
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